There are still some Unicode-related issues in 1.6.2. I should have spotted them sooner,
sorry for that. The most problematic one is caused by urllib.urlencode
mishandling unicode objects, which I wasn't aware of.
Personally, I would make sure that internally you only have unicode objects,
i.e. that SPARQLWrapper.setQuery and similar methods convert str objects to
unicode objects. Then encode them to UTF-8 right before applying urlencode and
assembling the HTTP request. I hope 2to3 is then able to apply the correct transformations.
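A minimal sketch of what I mean, written for Python 3 (where the text type is str and the byte type is bytes). The helper names to_text, form_body and direct_body are hypothetical and not part of SPARQLWrapper, and the "update" parameter name is assumed from the SPARQL 1.1 Protocol rather than taken from the library's code:

```python
from urllib.parse import urlencode  # in Python 2 this function lives in urllib


def to_text(query, encoding="utf-8"):
    """Return the query as text, decoding byte strings (hypothetical helper)."""
    if isinstance(query, bytes):
        return query.decode(encoding)
    return query


def form_body(query, param="update"):
    """Build a URL-encoded request body as bytes (param name is an assumption)."""
    text = to_text(query)
    # Encode explicitly to UTF-8 so urlencode never has to pick a codec itself.
    encoded = urlencode({param: text.encode("utf-8")})
    # The percent-encoded result is pure ASCII, so this step cannot fail.
    return encoded.encode("ascii")


def direct_body(query):
    """Build the body for a direct POST: the query itself as UTF-8 bytes."""
    return to_text(query).encode("utf-8")


uquery = 'INSERT DATA { <urn:michel> <urn:says> "\u00e9" }'
query = uquery.encode("utf-8")

# Text and byte input now produce identical request bodies.
same_form = form_body(uquery) == form_body(query)
same_direct = direct_body(uquery) == direct_body(query)
```

With this shape, the type of the caller's argument only matters at the entry point; everything after to_text works on text, and bytes appear again only when the body is assembled.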
Python 2.7.6 (default, Mar 22 2014, 15:40:47)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = u'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
u'INSERT DATA { <urn:michel> <urn:says> "\xe9" }'
>>> query
'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')
POSTDIRECTLY only works for unicode objects. Apart from the unclear error
message, this is not necessarily wrong, because a SPARQL query is Unicode text
and the SPARQL protocol mandates UTF-8 as the charset.
>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c450>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
>>> wrapper.query()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
return QueryResult(self._query())
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
request = self._createRequest()
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 442, in _createRequest
request.data = self.queryString.encode('UTF-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 39: ordinal not in range(128)
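The error message is confusing because calling encode on a Python 2 str first decodes it implicitly with the ASCII codec. A small sketch that emulates this in Python 3 (python2_str_encode is a hypothetical name used only for illustration):

```python
raw = 'INSERT DATA { <urn:michel> <urn:says> "\u00e9" }'.encode("utf-8")


def python2_str_encode(raw_bytes, encoding="utf-8"):
    """Emulate Python 2 str.encode: implicit ASCII decode, then encode."""
    return raw_bytes.decode("ascii").encode(encoding)


failure = None
try:
    python2_str_encode(raw)
except UnicodeDecodeError as exc:
    # Record which codec failed, where, and on which byte.
    failure = (exc.encoding, exc.start, exc.object[exc.start])
```

The failing byte and position are the same as in the traceback above: byte 0xc3 at position 39, the first byte of the UTF-8 encoded "é".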
With URLENCODED it is the other way round: a unicode object fails, because
urllib.urlencode calls str() on each value (see the traceback below), which
implicitly encodes it with the ASCII codec. A UTF-8 encoded str works.
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
return QueryResult(self._query())
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
request = self._createRequest()
File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 448, in _createRequest
request.data = urllib.urlencode(parameters, True)
File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 39: ordinal not in range(128)
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c290>
The same test with Python 3:
Python 3.4.1 (default, Jul 6 2014, 20:01:46)
[GCC 4.9.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = 'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query
b'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')
>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f8587444048>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
self.queryType = self._parseQueryType(query)
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
File "/usr/lib/python3.4/re.py", line 175, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
return QueryResult(self._query())
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 485, in _query
response = urlopener(request)
File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 453, in open
req = meth(req)
File "/usr/lib/python3.4/urllib/request.py", line 1120, in do_request_
raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
self.queryType = self._parseQueryType(query)
File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
File "/usr/lib/python3.4/re.py", line 175, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object
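Both setQuery(query) failures in the Python 3 session come from the same line in _parseQueryType, which applies a str pattern to a bytes object. A minimal sketch of decoding before the regex runs, assuming UTF-8 input; strip_comments is a hypothetical stand-in and only mirrors the pattern visible in the traceback, not the rest of the library's logic:

```python
import re

# Same pattern as in the traceback: a '#' up to and including the line break.
COMMENT = re.compile(r"#.*?\n")


def strip_comments(query, encoding="utf-8"):
    """Remove single-line comments, accepting text or UTF-8 bytes."""
    if isinstance(query, bytes):
        # Decode first so the str pattern is applied to a str object.
        query = query.decode(encoding)
    return COMMENT.sub("", query)


text_result = strip_comments('# note\nINSERT DATA { <urn:michel> <urn:says> "\u00e9" }')
bytes_result = strip_comments(b'# note\nINSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }')
```

Both calls return the same text, so the rest of the code never sees a bytes object.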