Skip to content

Mabofoul/step by step#4

Open
NoneSince wants to merge 38 commits into
Thomas-995:masterfrom
NoneSince:mabofoul/step_by_step
Open

Mabofoul/step by step#4
NoneSince wants to merge 38 commits into
Thomas-995:masterfrom
NoneSince:mabofoul/step_by_step

Conversation

@NoneSince

Copy link
Copy Markdown

the 5th commit (c78612c) is a revert to start again from your master. I did so to push again the changes I have done but make each step visible by giving it its own commit. now it is easier for you to check file comparisons.
regarding the utility function split() that I added, I didnt test it, but I ran the whole program on the examples and it worked.

NoneSince and others added 23 commits January 25, 2024 00:44
…s of header files, removed boost dependancy, removed/added spaces. included @xtrm-en work of PR #1, except his makefile (his makefile is better but i wont copy others work)
ordering CON_TOKENTYPE handlers/types like ordered in enum CON_TOKENTYPE
The reason: i want to push the changs one step at a time to make it easier to review.
src/*: whitespaces
src/construct.cpp: added a note you explained in the youtube video
README.md: added eexplaination for "-o" flag
warning 2: functions missing return statements
warning 3: for loop index is of type int but is compared to a value of type size_t
…t. also it is used in src/construct_flags.cpp so i added declaration.

con_type looks like the token types while also being only used in the file itself so i renamed to _con_type.
apply_macro_to_token() is used only in the file itself, so i set it as static and removed it from the header.
cstring is included for strcmp. so I replaced it with std::string comparison.
boost is used for split, is_any_of, and to_lower. so I implemented the functionality we need.
… the types and functions you directly use, and not depend on the current header files to complete the needed includes.

for example: the file src/construct_flags.h only needs to know the type std::string, so only #include <string>, and since the function implementation appears in src/construct_flags.cpp, it also needs an #include <string>.
you should put that same include in both .h and .cpp files, since technically the .h file can be empty, then it won't have includes for arguments types, the .cpp has to do the includes.
explaination for the importance of this rule: i remember something like this: <string> under one compiler / operating system does #include <algorithm> inside the header, so the program in the story ran filne without noticing the need of direct #include <algorithm>. once the same code was compiled with a different compiler / OS, it failed compilation since <algorithm> was not included.
…hrow error;".

before, an error could go unnoticed, but now the program will end automatically (I didn't do try-catch)
move empty line from top  of output file to bottom
…parser

fix parser breaking from whitespaces
windows adds exe to program name so makefile alywas recompiles
@NoneSince

Copy link
Copy Markdown
Author

Conversation regarding this PR is under PR #3

@Thomas-995 Thomas-995 left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try compiling and assembling the construct code before committing. In "strchr" findchr is macrod to sil because we cannot compare a single byte with a 64 bit register, which is why findchr is macrod to the 8 bit region of the register. This is also why when assembling it will give strchr.asm:10: error: mismatch in operand sizes

Thomas-995
Thomas-995 previously approved these changes Jan 31, 2024
@NoneSince

Copy link
Copy Markdown
Author

Sorry, I did not think of that! I will revert this change.

Also, about the "split takes alot of lines" issue, in a next PR im moving it down:
#include ...
...
static void to_lower(string& str);

class IsAnyOf {
private:
string chars;
public:
...
};

template
static void split(vector& result, const string& input, const Predicate& pred,
const bool& compress_adj_delims=true, const bool& strip=true);
...
implemenation of deconstruct functions...
...
// ----- ----- ----- ----- ----- ----- helper functions impl ----- ----- ----- ----- -----
implementation for split and stuff

@Thomas-995 Thomas-995 dismissed their stale review January 31, 2024 17:31

If the changes to the construct code can be removed, as well as the removal of boost I can merge it. (Sorry accidentlly approved already)

@Thomas-995

Copy link
Copy Markdown
Owner

Sorry, I did not think of that! I will revert this change.

Also, about the "split takes alot of lines" issue, in a next PR im moving it down: #include ... ... static void to_lower(string& str);

class IsAnyOf { private: string chars; public: ... };

template static void split(vector& result, const string& input, const Predicate& pred, const bool& compress_adj_delims=true, const bool& strip=true); ... implemenation of deconstruct functions... ... // ----- ----- ----- ----- ----- ----- helper functions impl ----- ----- ----- ----- ----- implementation for split and stuff

I think as long ONLY the required string functions are added and they're placed neatly so as not to cause confusion it might be okay to just fit it into the deconstruct file. I would like it if you did not make them template functions however. I'll take a look at the next PR.

@NoneSince

NoneSince commented Jan 31, 2024

Copy link
Copy Markdown
Author

also, i have just noted reg_to_str() is used only in reconstruct.cpp, so it should be static.
What is the code style you drive for? I for real suggest you use "my" style, because that's literally how everyone i have seen in person writes code, and the only devs with your style I saw online were JS developers. In JS there are arrow functions and many wrappers so you might want to collapse the brackets. And here, functions and classes are few so you pay no penalty for giving theie scope a special look.

@Thomas-995

Copy link
Copy Markdown
Owner

reg_to_str being static makes sense yeah since it will likely only ever be used to reconstruct code so I like that yes. My main issue with making functions that are only used in their respective source file static is that then we would have to do it for all of them, but if you are up to that I don't mind. Also I am aware that this isn't the usual style, I do not have this habit from js haha. I just think having a curly brace on its own seperate line makes sense when closing a function, as it's the only indicator (besides indentation) that a function has ended. But for opening a function you already have the function decleration as a signal that a new function has been opened. It's just preference but I see no reason to change it for everything.

@Thomas-995

Copy link
Copy Markdown
Owner

Also I really like how you did the string functions here. Instead of cluttering the header file which is supposed to be a nice overview of functionality, you just quickly declared them at the start of the cpp file. I'll look over it properly tommorow, but this is good.

@Thomas-995

Copy link
Copy Markdown
Owner

Though why wouldn't you just make the string split function return a pointer to the result?

@NoneSince

Copy link
Copy Markdown
Author

Though why wouldn't you just make the string split function return a pointer to the result?

I did that because I was mimicking boost::split, and I didnt think of doing it as you say.
I will define the vector inside the function, and return a copy.
note that there is no need for pointers, vector inside the function will be copied outside and then deleted, so you can return a vector instead of vector*.
if you meant the function receives a vector and returns a pointer to it, then that's meaningless. simply get a reference and write inside it (exactly like now)

about code style, i will keep up with your style, since you started all of that. If I were the owner, I would run my style (also, indentation is 4 spaces), so now I cede this matter to you, with full acceptance in-sha-allah.

NoneSince and others added 7 commits January 31, 2024 20:44
strchr.con: reverted wrong change - findchr is 8bit while chr is 64bit, so they are different
reg_to_str: changed back to use bitwidth as argument, and made static
split: better args
use syscalls like functions, and pass arguments to function/syscall correctly
@NoneSince NoneSince mentioned this pull request Feb 2, 2024
@NoneSince

Copy link
Copy Markdown
Author

I see a problem in parsing commands with memory addresses, like "mov eax, dword ptr [ebx]".
I split a line by spaces and comma, then take first 3 words. I will try to fix that by splitting by comma first, and all what is to right will be put together, and the words to left will be the command itself and arg1.
And an extra check is to see if there are brackets to denote memory address, and if both arg1 and arg2 are memory, its a syntax error (in assembly, not that we wanna prevent for convenience)

@NoneSince NoneSince requested a review from Thomas-995 February 2, 2024 06:25
@NoneSince

Copy link
Copy Markdown
Author

another thing: why do you do to_lower() to every line you parse? it makes strlwr.con example store the string already in lower case, which proves the action is wrong.
strlwr.con: teststring db "HeLlO WoRlD", 0
strlwr.asm: teststring db "hello world", 0

also, there is no colon after "teststring". isn't that wrong?

NoneSince and others added 3 commits February 2, 2024 10:11
src/construct.cpp: free allocated memory
src/deconstruct.cpp: remove to_lower() function and usage
src/reconstruct.cpp: same apply-macro logic but better code
examples/*.con: using macros for magic numbers
examples/*.asm: showing the result of the example inputs
@NoneSince

Copy link
Copy Markdown
Author

I see a problem in parsing commands with memory addresses, like "mov eax, dword ptr [ebx]". I split a line by spaces and comma, then take first 3 words. I will try to fix that by splitting by comma first, and all what is to right will be put together, and the words to left will be the command itself and arg1. And an extra check is to see if there are brackets to denote memory address, and if both arg1 and arg2 are memory, its a syntax error (in assembly, not that we wanna prevent for convenience)

I'm unable to commit on this PR, because I have other commits locally and pushing them means you will review them and delay merging this branch.
So, i will fix these after you merge.

@Thomas-995

Copy link
Copy Markdown
Owner

Pulled and compiled the branch, running construct on the example strlwr.con gives a SIGSEGV (Address boundary error). And the name of the compiled executable is construct.exe, it should be dependant on the OS. (Linux executables don't have an extension.)

@NoneSince

Copy link
Copy Markdown
Author

regarding construct.exe, windows must have the .exe even if you dont write any extention, while linux isn't affected with the extension, so i did put the extention. you can roll with it and fix that later if you know (it is not a "fix" since there is no problem, you just prefer the name under linux be "construct") .
about the error, I don't know, but since them i have fixed some stuff and bugs that I added. i think i will do a PR that only changes style (no logic) for you to quickly test and push, because I have alot of free time so I tamper with the code and grow further each day

@NoneSince

Copy link
Copy Markdown
Author

now everything works: i ran your original master on the 3 examples, and ran my PR on these examples, and compared the output files (didnt run them, and need to). the files are the same, which means that my code is a step forward.
regarding ".exe" extention, keep it and merge the PR, because this reaaaally drives us back to freeze all the PR work for a further notice. as i said, im building over it

@NoneSince

Copy link
Copy Markdown
Author

@Thomas-de-Bock please see the last update

@Thomas-995

Thomas-995 commented Feb 5, 2024

Copy link
Copy Markdown
Owner

I'll take a look at it tomorrow. Also for the exe extension thing, that's not up for debate. It only had no file extension before because the build was never meant for windows. The exe extension is exclusive to windows executables and linux binaries have no extension, this is not my preference it is the general consensus. Compiling a program currently exclusively made for use on Linux to a file with the windows executable extension is very much a problem. I'll merge it if that's fixed and everything else works.

@NoneSince

Copy link
Copy Markdown
Author

"I'll merge it if that's fixed"... why not merge it then remove the .exe?
also as I said, I don't run the output asm file, I only compare it to what was produced before, so even if the out asm is broken, my PR is still fine and ready for merge, since I achieved the current state in a better way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants