Heterogeneous DataFrame by Kriyszig · Pull Request #3 · Kriyszig/magpie

Kriyszig · 2019-06-07T02:28:39Z

Issue in focus: #2
WIP: #1

This PR brings heterogeneous DataFrame support to Magpie. The complete details have been added to the README.

thewilsonator · 2019-06-10T01:26:02Z

-    /// Codes to map the index of above column to their position
-    int[][] ccodes = [];
+    /// Row and Column indexing
+    Indexing[2] indexing;


Add

    ref row() { return indexing[0]; }
    ref column() { return indexing[1]; }

and use those in place of indexing[0] and indexing[1].

Addressed in 2b3a9b1

Thanks. Much easier to read.
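
For readers following the thread, here is a minimal sketch of the accessor pattern suggested above. The `Indexing` fields (`titles`, `codes`) are simplified assumptions for illustration and not necessarily Magpie's actual layout. The explicit return type and the `return` attribute are additions that keep the sketch compiling under stricter compiler settings.

```d
/// Simplified stand-in for the per-axis index data (fields are assumptions).
struct Indexing
{
    string[] titles;   // hypothetical: names of the index levels
    int[][] codes;     // hypothetical: positions for each level
}

struct Index
{
    /// Row and column indexing: 0 - rows, 1 - columns
    Indexing[2] indexing;

    /// Named accessors returning references, so writes reach `indexing`.
    ref Indexing row() return { return indexing[0]; }
    ref Indexing column() return { return indexing[1]; }
}

unittest
{
    Index idx;
    idx.row.titles = ["Index"];
    idx.column.codes = [[0, 1, 2]];

    // The accessors alias the array elements rather than copying them.
    assert(idx.indexing[0].titles == ["Index"]);
    assert(idx.indexing[1].codes == [[0, 1, 2]]);
    assert(&idx.row() == &idx.indexing[0]);
}
```

Because both accessors return by `ref`, call sites can read `idx.row` and `idx.column` wherever they previously used `idx.indexing[0]` and `idx.indexing[1]`, including on the left-hand side of an assignment.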

Kriyszig added 18 commits June 2, 2019 22:19

REDESIGN: Accommodate Heterogeneous DataFrame

0c15adb

* Redesigned all the major functions to accommodate heterogeneous DataFrame
* from_csv() still in development - many optimizations possible
* Basic tests are passing

Updated README with the changes in API and its usage

a106607

* Rewrote the DataFrame declaration template - removed assignment until it's formally added to the API
* Rewrote the structure definition
* Added dispStr params for display function

ENHANCEMENT: New way to initialize a DataFrame using structure

51b8cb1

* Initialize DataFrame using Fields!S in O(log(n)) time
* Added unittests for the above feature
* Added unittests for partial parsing of DataFrame

INDEX OPERATION: Primary index operation to assign and access values

63f453b

Assign and access values based on their direct indexes.

INDEX OP: Initial support to access data using the labels

32141be

Added an initial base for assigning elements based on their string labels.

Added setIndex() for Index. Optimized indexes after CSV parsing.

175a096

* Added a new method to optimize indexes
* Added setter for Index
* Optimized indexes after parsing by triggering optimize()

ENHANCEMENT + BUGFIX: Fixed a bug in assignment based on string index.

1803044

* Added setting of DataFrame index
* Bugfix in index based assignment op

Bugfix + Unittest: Fixed Index.optimize(). Added from_csv() unittest

e767ce0

* Index.optimize() didn't recreate the complete index array before checking integer casting.
* Added a public dataset to check stability of the parser.

Licensing, Parsing example & assignment op

d48c022

* Added BSL-1.0 as license in dub.json
* Added a second dataset to show error-free parsing of CSV with gaps
* Added a total assignment op with 2D arrays

Extend index

9c4fdc6

* Added function to extend indexes if the user needs to add more indexes down the line
* Unittests to verify correct behavior
* Small optimizations here and there.

assign(axis)() and revamped documentation

b65490e

* Added method to assign values to rows and columns of DataFrame
* Fixed minor bug in Index
* Revamped README.md

Fixed errors in documentation

16e4b9a

Index redesign - Array of struct

0a25613

* Index is an array of structs of size 2 instead of 6 separate variables (0 - rows, 1 - columns)
* Moved getArgsList to helper.d

Minimized code repetition

57feedf

Code size could be reduced once the new indexing structure was adopted.

opIndex based on integral and string indexes

e7dee19

* Array-like access to elements
* Minor optimization

Update documentation and error messages

0b02058

* Documented index operation

Getting an entire column from string index

894bb9b

* Get an entire column from string index

Returning an entire row/column as Axis struct

0e8b879

* Added Axis struct to get row/column from DataFrame
* Binary ops on Axis will translate to row/column binary operations on the DataFrame

Kriyszig force-pushed the experimental branch from c960e7c to 0e8b879 Compare June 9, 2019 11:55

convertTo for Axis to convert type of Axis.data

5fc071c

* This would be necessary for column binary ops

thewilsonator reviewed Jun 10, 2019

Kriyszig added 4 commits June 10, 2019 09:53

Adding ref for row and column indexes

2b3a9b1

* Using Index.row instead of Index.indexing[0]
* Using Index.column instead of Index.indexing[1]

Addition Op on Axis

faf54b2

* Addition operation on Axis, for both rows and columns

Subtraction Op for Axis

4cbe909

* Row and column binary subtraction

Multiplication and Division Op on Axis

34d2e6d

* Multiplication and Division of rows/columns
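
The Axis arithmetic added in the last few commits can be illustrated with a short sketch. This is not Magpie's actual `Axis` implementation: it assumes a simplified, hypothetical `Axis(T)` that wraps a single homogeneous array, and shows how one templated `opBinary` can cover addition, subtraction, multiplication and division element-wise.

```d
/// Hypothetical, simplified Axis: one row or column held as a plain array.
struct Axis(T)
{
    T[] data;

    /// Element-wise +, -, * and / between two axes of equal length.
    Axis opBinary(string op)(const Axis rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        assert(data.length == rhs.data.length, "Axis lengths must match");
        auto result = new T[data.length];
        foreach (i; 0 .. data.length)
            result[i] = mixin("data[i] " ~ op ~ " rhs.data[i]");
        return Axis(result);
    }
}

unittest
{
    auto a = Axis!double([1.0, 2.0, 3.0]);
    auto b = Axis!double([4.0, 5.0, 6.0]);

    assert((a + b).data == [5.0, 7.0, 9.0]);
    assert((b - a).data == [3.0, 3.0, 3.0]);
    assert((a * b).data == [4.0, 10.0, 18.0]);
    assert((b / a).data == [4.0, 2.5, 2.0]);
}
```

Both sketches in this thread can be checked with `dmd -unittest -main -run file.d`. Using a string mixin keeps the four operators in one function body, at the cost of requiring that `T` supports each operator directly.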

Kriyszig merged commit 0d8c6ad into master Jun 13, 2019

Kriyszig deleted the experimental branch June 13, 2019 10:43

Kriyszig merged 23 commits into master from experimental
