@pombredanne
Member

Reference: #13
Reference: #14
Reported-by: Armijn Hemel @armijnhemel
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reference: #11
Reported-by: Armijn Hemel @armijnhemel
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reference: #14
Reported-by: Armijn Hemel @armijnhemel
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Member Author

@keshav-space can you help me there?

  1. We should test whether a file is source code before running either ctags or xgettext on it. There is https://github.com/nexB/typecode/blob/92feb7be3a87c1b541e7034c3f9797c96bc52305/src/typecode/contenttype.py#L683 for this.
  2. We may want to keep the test expectations flexible across versions of the backing tools, as they all seem to behave differently.
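
A minimal sketch of point 1, under stated assumptions: `collect_strings`, `looks_like_source` and `SOURCE_EXTENSIONS` are hypothetical names, and the extension-based check is only a stand-in for the typecode helper linked above, which would presumably be passed in as `is_source` instead.

```python
from pathlib import Path

# Hypothetical stand-in: a real check would likely rely on the typecode
# helper linked above rather than on file extensions.
SOURCE_EXTENSIONS = {".c", ".h", ".py", ".java"}


def looks_like_source(location):
    """Return True if the file extension suggests source code."""
    return Path(location).suffix.lower() in SOURCE_EXTENSIONS


def collect_strings(location, extractor, is_source=looks_like_source):
    """
    Run ``extractor`` (for example a ctags or xgettext wrapper) on
    ``location`` only when ``is_source`` says the file is source code.
    Return an empty list for anything else.
    """
    if not is_source(location):
        return []
    return list(extractor(location))
```

Passing the predicate as a parameter keeps the guard independent of the detection method, so the stand-in can be swapped out without touching the callers.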

@keshav-space
Member

Ack, will add the necessary change in this PR.

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@keshav-space keshav-space force-pushed the clean-strings branch 2 times, most recently from 48914f0 to f3a9351 Compare March 19, 2024 11:28
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@keshav-space keshav-space merged commit 002ead1 into main Mar 19, 2024
@keshav-space keshav-space deleted the clean-strings branch March 19, 2024 13:33
@armijnhemel
armijnhemel commented Apr 12, 2024

So what about file names that contain a space, or perhaps a colon? Those are valid file names, and xgettext will happily process them, producing something like this (I just renamed a file and then ran xgettext):

#: fdisk:big.c:234 fdisk:big.c:2942

I am not sure how common file names with colons are, but file names with spaces are not uncommon.

As far as I can tell partition() will not properly process this:

>>> a = '#: fdisk:big.c:234 fdisk:big.c:2942'
>>> _, _, bla = a.partition('#:')
>>> bla.partition(':')
(' fdisk', ':', 'big.c:234 fdisk:big.c:2942')

@armijnhemel

Thinking about it a bit more: the only certainty you have is that the part after the last : should be a line number, so rewriting this to use rpartition() would perhaps be the cleanest fix.
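
A rough sketch of that rpartition() idea, not the project's actual code: `parse_reference` and `parse_comment_line` are hypothetical names. It splits each reference at its last colon, so colons inside the file name survive. It still separates references on whitespace, so file names containing spaces remain ambiguous and would need separate handling.

```python
def parse_reference(ref):
    """
    Split one "path:lineno" reference at its LAST colon, so that colons
    inside the path are preserved. Raise ValueError if the trailing part
    is not a line number.
    """
    path, sep, lineno = ref.strip().rpartition(":")
    if not sep or not path or not lineno.isdigit():
        raise ValueError(f"Not a path:lineno reference: {ref!r}")
    return path, int(lineno)


def parse_comment_line(line):
    """
    Return a list of (path, lineno) tuples from a "#:" reference comment.
    Return an empty list if the line has no "#:" marker.
    Assumption: references are separated by whitespace, so paths that
    contain spaces are NOT handled here.
    """
    _, marker, refs = line.partition("#:")
    if not marker:
        return []
    return [parse_reference(ref) for ref in refs.split()]
```

For the line shown above, `parse_comment_line('#: fdisk:big.c:234 fdisk:big.c:2942')` yields `[('fdisk:big.c', 234), ('fdisk:big.c', 2942)]`.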
