Heterogeneous DataFrame #3
Merged
* Redesigned all the major functions to accommodate heterogeneous DataFrame
* from_csv() still in development - many optimizations possible
* Basic tests are passing

* Rewrote the DataFrame declaration template - removed assignment until it's formally added to the API
* Rewrote the structure definition
* Added dispStr params for display function

* Initialize DataFrame using Fields!S in O(log(n)) time
* Added unittests for the above feature
* Added unittests for partial parsing of DataFrame

Assign and access values based on their direct indexes.

Added initial base to assign elements based on string labels

* Added a new method to optimize indexes
* Added setter for Index
* Optimized indexes after parsing by triggering optimize()

* Added setting of DataFrame index
* Bugfix in index based assignment op

* Index.optimize() didn't recreate the complete index array before checking integer casting.
* Added a public dataset to check stability of the parser.

* Added BSL-1.0 as license in dub.json
* Added a second dataset to show errorless parsing of CSV with gaps
* Added a total assignment op with 2D arrays

* Added function to extend indexes if user needs to add more indexes down the line
* Unittests to verify correct behavior
* Small optimizations here and there.

* Added method to assign values to rows and columns of DataFrame
* Fixed minor bug in Index
* Revamped README.md

* Index is an array of 2 structs instead of 6 separate variables: 0 - rows, 1 - columns
* Moved getArgsList to helper.d

Code size can be reduced after the new indexing structure was adopted.

* Array like access to elements
* Minor optimization

* Documented index operation

* Get an entire column from string index

* Added Axis struct to get row/column from DataFrame
* Binary ops on Axis will turn into row/column binary operations on DataFrame

* This would be necessary for column binary ops
/// Codes to map the index of above column to their position
int[][] ccodes = [];
/// Row and Column indexing
Indexing[2] indexing;
add
ref row() { return indexing[0]; }
ref column() { return indexing[1]; }
and use those in place of indexing[0] and indexing[1]
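
For reference, the suggestion above could look roughly like the sketch below. It is only an illustration: the `titles` field of `Indexing` is a hypothetical stand-in, since the real fields are not shown in the quoted diff, and the return type is spelled out instead of being inferred.

```d
/// Simplified stand-in for the real Indexing struct.
/// The `titles` field is hypothetical and exists only for this sketch.
struct Indexing
{
    string[] titles;
}

struct Index
{
    /// Row and Column indexing
    Indexing[2] indexing;

    /// Named accessors; returning by ref lets callers modify the
    /// underlying element instead of a copy.
    ref Indexing row() { return indexing[0]; }
    ref Indexing column() { return indexing[1]; }
}

unittest
{
    Index idx;
    idx.row.titles = ["r0", "r1"];
    idx.column.titles = ["c0"];

    // The accessors alias the array elements, so both views agree.
    assert(idx.indexing[0].titles == ["r0", "r1"]);
    assert(idx.indexing[1].titles == ["c0"]);
}
```

Call sites can then read `index.row` and `index.column`, which avoids having to remember which slot holds which axis.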
* Using Index.row instead of Index.indexing[0]
* Using Index.column instead of Index.indexing[1]

* Addition operation on Axis - both for row and column like

* Row and Column binary subtraction

* Multiplication and Division of rows/columns
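
As a rough illustration of how element-wise binary operations on an axis can be expressed in D, here is a minimal sketch using `opBinary`. It assumes a simplified `Axis` that holds a homogeneous `double[]`; the actual Axis struct in this PR works on heterogeneous data and may be implemented quite differently.

```d
/// Simplified, hypothetical Axis: one row or column stored as doubles.
struct Axis
{
    double[] data;

    /// Element-wise +, -, * and / between two axes of equal length.
    Axis opBinary(string op)(const Axis rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        assert(data.length == rhs.data.length, "Axis lengths must match");

        auto result = new double[data.length];
        foreach (i; 0 .. data.length)
        {
            // The operator is spliced in at compile time.
            result[i] = mixin("data[i] " ~ op ~ " rhs.data[i]");
        }
        return Axis(result);
    }
}

unittest
{
    auto a = Axis([1.0, 2.0, 3.0]);
    auto b = Axis([4.0, 5.0, 6.0]);

    assert((a + b).data == [5.0, 7.0, 9.0]);
    assert((b - a).data == [3.0, 3.0, 3.0]);
    assert((a * b).data == [4.0, 10.0, 18.0]);
    assert((b / a).data == [4.0, 2.5, 2.0]);
}
```

A single templated `opBinary` covers all four operators, so adding another operator is a matter of extending the template constraint.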
Issue in focus: #2
WIP: #1
This PR brings heterogeneous DataFrame support to Magpie. The complete details have been added to the README.