[ZEPPELIN-1255] Add cast to string in z.show() for Pandas DataFrame #1249

Closed
bustios wants to merge 2 commits into apache:master from bustios:ZEPPELIN-1255

Conversation

@bustios
Contributor

bustios commented Jul 30, 2016

What is this PR for?

Casting data types in Pandas DataFrame to string in z.show()

What type of PR is it?

Bug Fix

What is the Jira issue?

ZEPPELIN-1255

How should this be tested?

%python

import pandas as pd

df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
df.columns=[1, 2, 3, 'PetalWidth', 'Name']
z.show(df)

%python.sql

SELECT * FROM df  LIMIT 10
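
For context, here is a minimal sketch of why the cast matters, assuming z.show() builds tab-separated text for Zeppelin's %table display. The helper name to_table and the sample row are hypothetical, and this is not the code from the PR. It only illustrates that joining integer column labels such as 1, 2, 3 without str() raises a TypeError.

```python
def to_table(columns, rows):
    """Render column labels and rows as Zeppelin %table text (tab-separated).

    Hypothetical helper for illustration; not Zeppelin's actual implementation.
    """
    # str() on every label and cell is the kind of cast this PR describes:
    # "\t".join() raises TypeError on non-string items such as the int labels 1, 2, 3.
    header = "\t".join(str(col) for col in columns)
    body = "\n".join("\t".join(str(cell) for cell in row) for row in rows)
    return "%table " + header + "\n" + body + "\n"


# Same mixed int/str column labels as in the test snippet above; the row is sample data.
columns = [1, 2, 3, "PetalWidth", "Name"]
rows = [[5.1, 3.5, 1.4, 0.2, "Iris-setosa"]]

table = to_table(columns, rows)
print(table)
```

Without the str() calls, the header join alone would fail on the first integer label, which is presumably why the test uses numeric column names.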

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@felixcheung
Member

looks good!

@felixcheung
Member

could you add tests for this?

@bzz
Member

bzz commented Jul 31, 2016

Looks great to me. @bustios, could you add a simple test case that reproduces this case to python/src/test/java/org/apache/zeppelin/python/PythonInterpreterPandasSqlTest.java?

@bustios
Contributor Author

bustios commented Aug 1, 2016

Thanks for the review, @felixcheung and @bzz. I have already added the test case.

I think it would be nice to show a column with the row number or index if the DataFrame had one.

@bzz
Member

bzz commented Aug 1, 2016

Looks great to me.

> I think it would be nice to show a column with the row number or index if the DataFrame had one.

@bustios This makes sense; please feel free to open a separate JIRA issue for that improvement. We just need to make sure the index column is clearly distinguished so that the user cannot mistake it for part of the dataset.

The CI failure is due to flaky R integration tests, which are unrelated to this change.

Results :

Failed tests: 
  ZeppelinSparkClusterTest.sparkRTest:116 expected:<[[1] 3]> but was:<[<pre><code>Error in getSparkSession(): SparkSession not initialized
</code></pre>

<pre><code>Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'count' for signature '&quot;function&quot;'
</code></pre>]>

Tests run: 65, Failures: 1, Errors: 0, Skipped: 0

Merging if there is no further discussion.

@asfgit asfgit closed this in 6f867ce Aug 1, 2016
PhilippGrulich pushed a commit to SWC-SENSE/zeppelin that referenced this pull request Aug 8, 2016
Author: paulbustios <pbustios@gmail.com>

Closes apache#1249 from bustios/ZEPPELIN-1255 and squashes the following commits:

82c1412 [paulbustios] Add test case for z.show() Pandas DataFrame
4a8c0a9 [paulbustios] [ZEPPELIN-1255] Add cast to string in z.show() for Pandas DataFrame