Skip to content

Provide JSON output option#277

Closed
niklasmohrin wants to merge 2 commits into
bee-san:masterfrom
niklasmohrin:json-output
Closed

Provide JSON output option#277
niklasmohrin wants to merge 2 commits into
bee-san:masterfrom
niklasmohrin:json-output

Conversation

@niklasmohrin
Copy link
Copy Markdown
Contributor

Hey! This should close #213.

The solution was pretty straightforward. I was wondering whether it would be nicer to have a flag that outputs the results as JSON to the console so they could be piped into another process or get redirected to a file. I also left a TODO at the merge_required part, because I was not sure whether I had to add something there.

Let me know what you think!

@bernardoamc
Copy link
Copy Markdown
Contributor

I was wondering whether it would be nicer to have a flag that outputs the results as JSON to the console so they could be piped into another process or get redirected to a file.

We could introduce a flag called --format that would accept json or text for now and maybe another one for --output which would be stdout for default or a file. What do you think?

Having the format first would allow us to pipe the output to another process or redirect to a file as you mentioned. <3

I also left a TODO at the merge_required part, because I was not sure whether I had to add something there.

That's my least favourite part of the code, but I couldn't find a better solution at the time to merge the options from the config file and the options passed on the command line directly.

The flow to add a new thing to the config is:

  1. Add option to the Config struct
  2. If the new flag is Optional (without a default), add it to the merge_optional! call.
  3. Otherwise add it to the merge_required! call.

@bee-san
Copy link
Copy Markdown
Owner

bee-san commented Oct 16, 2020

I was wondering whether it would be nicer to have a flag that outputs the results as JSON to the console so they could be piped into another process or get redirected to a file.

We could introduce a flag called --format that would accept json or text for now and maybe another one for --output which would be stdout for default or a file. What do you think?

Having the format first would allow us to pipe the output to another process or redirect to a file as you mentioned. <3

I also left a TODO at the merge_required part, because I was not sure whether I had to add something there.

That's my least favourite part of the code, but I couldn't find a better solution at the time to merge the options from the config file and the options passed on the command line directly.

The flow to add a new thing to the config is:

1. Add option to the Config struct

2. If the new flag is Optional (without a default), add it to the `merge_optional!` call.

3. Otherwise add it to the ` merge_required!` call.

Just to reiterate, that last part is a problem on our side -- we do not have documentation on adding a new thing to the config so sorry for any confusion :) I'll add it in now ✨

Update, added them here:
https://github.com/RustScan/RustScan/wiki/Adding-new-options---flags

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

Alright, this is what I came up with: There is the --format flag that can (as of now) be text or json. Additionally, there is the --output-file flag which can be set to have the formatted output written to a file instead of to the console. I (for now) decided that if the format flag is set to text (the default) and the output has not been set to a file (and nothing else signals that the user cares about the formatted output), nothing is printed. This might be confusing, if you specifically write --format text and don't get any output, so I might wrap this flag into an Option and implement default (so that unwrap_or_default() can be used). Let me know what you think.

I also moved the text output that was in the Nmap loop out of there. While doing so, I found Itertools::join and since you already have itertools as a dependency I used that and also replaced the old joining in the loop.

Lastly, I updated the output of rustscan -h in README.md and the option addresses moved, but this is correct and has just been forgotten when that flag changed.

@bee-san
Copy link
Copy Markdown
Owner

bee-san commented Oct 16, 2020

(for now) decided that if the format flag is set to text (the default) and the output has not been set to a file (and nothing else signals that the user cares about the formatted output), nothing is printed

This sounds rather confusing, I would hope that if the format is text and it's not outputting to a file, it outputs to the terminal?

Else, we make them mutually exclusive in the code itself, so if format == text and not file_output stop program. Although, from my perspective, if the default is text we shouldn't panic if our input is the same as the default

What do you think? :)

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

niklasmohrin commented Oct 16, 2020

@bee-san Yeah, it's rather unintuitive. Things like Open 80 are still output to the terminal currently, only the summary format is not getting printed when running rustscan --format text 1.2.3.4.

To make sure we're on the same page here: Is the "text" / greppable / summary / list format (for example 1.2.3.4 -> [80,42]) some standard format or was it just put together as a nice output in this program? Maybe it should be renamed to "greppable", "summary" or "list".

Here is my next proposal: The greppable flag enables the formatted summary (like it does right now) additionally to muting other output. This would mean that --format determines how and if the summary is output, and --greppable hides all output except for the summary and also implies --format greppable (or --format text) if nothing else is specified. Similarly, --output-file some/file would also imply --format greppable if not specified differently.

Anyway, I would implement Default for OutputFormat with greppable as the default and make the field in Opts a Option<OutputFormat>, so that in main, there would be something like

fn main() {
  // ...
  if opts.greppable || opts.format.is_some() || opts.output_file.is_some() || ... {
    match opts.format.unwrap_or_default() {
      // ...
    }
  }
}

Note: The more I think about it, I wonder if --format should rather be called --summary, what do you think?

@bernardoamc
Copy link
Copy Markdown
Contributor

My two cents on this topic:

I would like to simplify our public APIs otherwise we start running into confusing scenarios.

The challenge here seems to to be having an useful output in our generated file, which is the scanner result in this case versus having all the output of RustScan in the file.

In my opinion we shouldn't really output to a file the results of our scripts, we can let scripts themselves do that or the user can pipe stdout to a file.

@bee-san @niklasmohrin What do you think of:

  1. Dropping the greppable option entirely since we can already disable nmap with no_nmap which yield similar results.
  2. Introduce --summary [text/json/etc] that outputs only the results of our scanner in this format
  3. Introduce --output [stdout/file path] that prints the results of our scanner to stdout or to the specified file.

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

@bee-san @bernardoamc I think we are all thinking of a similar solution. To clarify: Should the summary output and the nmap scan be mutually exclusive, that is, should the --summary be basically one mode that generates the summary in contrast to say "normal mode" that pipes to Nmap (react 👍)? Or should it be an additional feature that also outputs the summary on the side, possibly writing it to a file before proceeding with the nmap scan (react 👎)?

I agree that this whole --output-file thing is not really good unix-style. If "summary mode" was a thing, I would get rid of the flag and instead rely on output redirection, like you said. To make that convenient, all output that does not belong to the summary could be moved to stderr. When running rustscan --summary json ... > out.json, you would only have the summary in the file and still have all the intermediate output, especially including the Open 80 notifications. Furthermore, you run rustscan --summary json ... 2> /dev/null to only get the summary and hide the rest of the output (which would also make --greppable obsolete, as you could do rustscan --summary greppable 2> /dev/null).

Then however, there should be something like a --no-stderr flag that moves all the normal stuff like the logo back to stdout, because some shells introduce other coloring on stderr and you might just want to run rustscan normally. It could also be the other way around, having a --split-output flag that introduces described splitting into stdout and stderr. Finally, one could also decide that based on whether --summary was specified.

On the other hand, if --summary should be a side thing that is output before the nmap scan (that will run), I would keep --output-file like it is right now.

I hope I did not confuse too much stuff, but I personally think that this distinction between say "normal mode" and "summary mode" would be a clean solution.

P.S. Maybe there should be a flag to have a JSON file as an input and take that as the result of the port scan. This could be nice if you wanted to try different nmap arguments against the same target. This can be discussed later or in a different thread though.

@niklasmohrin niklasmohrin marked this pull request as draft October 19, 2020 20:14
@niklasmohrin
Copy link
Copy Markdown
Contributor Author

@bee-san @bernardoamc You should be able to compare both approaches from my last comment by using git checkout for either of the two commits. The first one is what I originally came up with and the more recent one is the one from my last comment. Let me know which one you like better!

@niklasmohrin niklasmohrin mentioned this pull request Oct 29, 2020
@niklasmohrin
Copy link
Copy Markdown
Contributor Author

Since there is also an issue about having only the output from nmap and nothing else (see #296), this "mode" semantic could be further expanded:

  1. Default mode: The current behavior, that is: Logo, opening, open port output and nmap output
  2. Summary / Rustscan / portscan mode: Only output on stdout is the summary and intermediate output on stderr can be enabled or disabled with a flag
  3. Nmap / Pipe mode: No output from rustscan is on stdout, only what nmap outputs. Intermediate output on stderr may be enabled (+ maybe some smart output merging / structuring when multiple IPs are scanned? @iiiusky)

I am not sure what the best approach to encoding these modes in CLI parameters would be (subcommands or mutually exclusive flags?). I think that this would make for a great experience with rustscan, since you can access all possible use cases of the program easily through the choice of a mode.

Let me know what you think @bee-san @bernardoamc @iiiusky

@iiiusky
Copy link
Copy Markdown

iiiusky commented Oct 30, 2020

Since there is also an issue about having only the output from nmap and nothing else (see #296), this "mode" semantic could be further expanded:

  1. Default mode: The current behavior, that is: Logo, opening, open port output and nmap output
  2. Summary / Rustscan / portscan mode: Only output on stdout is the summary and intermediate output on stderr can be enabled or disabled with a flag
  3. Nmap / Pipe mode: No output from rustscan is on stdout, only what nmap outputs. Intermediate output on stderr may be enabled (+ maybe some smart output merging / structuring when multiple IPs are scanned? @iiiusky)

I am not sure what the best approach to encoding these modes in CLI parameters would be (subcommands or mutually exclusive flags?). I think that this would make for a great experience with rustscan, since you can access all possible use cases of the program easily through the choice of a mode.

Let me know what you think @bee-san @bernardoamc @iiiusky

I think the third mode can meet most of the needs, because the format that comes with nmap can be used. It is already a standard format. Many developers are already compatible with the parsing of nmap xml format. Of course, there is another situation. Only rustscan is used, and nmap scanning is not performed.In this case, if you need to output, you can consider customizing an output structure.

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

The problem is that we call nmap multiple times, once for every host, because there are different ports to scan on every host. Thus we have multiple XML files with different entries in the <scaninfo> tag (where the ports / services scanned are listed). I saw that nmap can run against multiple hosts, but afaik you can only specify the set of ports to scan once for all targets. I don't think we want to lose performance having nmap scan a port that is knowingly not open on a target but on another, which is why there has to be some smart merging of the XML outputs if there are multiple IPs or CIDRs.

# Let's say we found 10.10.10.21 to have ports 80, 81
# and 10.10.10.22 to have 80, 82 open
# Afaik the only way to have all of the open ports scanned
# in one nmap run is to scan all three ports on both machines
nmap -p 80,81,82 10.10.10.21 10.10.10.22
# which would be slightly slower for this and probably
# way slower for bigger examples

# We need to merge the outputs of
nmap -p 80,81 10.10.10.21
nmap -p 80,02 10.10.10.22
# to look as if we run the command
# from the above explanation

I think it would be reasonable to modify the XML content as if the nmap scan ran against all ports that we give to nmap at all (so it's the set of all ports found to be open by the port scan), because we know that the port is closed and nmap would have determined the same.

I totally agree that we should not introduce a totally new format for nmap results, that would be very inconvenient. @iiiusky What do you think would be the best way to accomplish that? Maybe it's just some serde magic around the nmap xml?

@iiiusky
Copy link
Copy Markdown

iiiusky commented Oct 30, 2020

The problem is that we call nmap multiple times, once for every host, because there are different ports to scan on every host. Thus we have multiple XML files with different entries in the <scaninfo> tag (where the ports / services scanned are listed). I saw that nmap can run against multiple hosts, but afaik you can only specify the set of ports to scan once for all targets. I don't think we want to lose performance having nmap scan a port that is knowingly not open on a target but on another, which is why there has to be some smart merging of the XML outputs if there are multiple IPs or CIDRs.

# Let's say we found 10.10.10.21 to have ports 80, 81
# and 10.10.10.22 to have 80, 82 open
# Afaik the only way to have all of the open ports scanned
# in one nmap run is to scan all three ports on both machines
nmap -p 80,81,82 10.10.10.21 10.10.10.22
# which would be slightly slower for this and probably
# way slower for bigger examples

# We need to merge the outputs of
nmap -p 80,81 10.10.10.21
nmap -p 80,02 10.10.10.22
# to look as if we run the command
# from the above explanation

I think it would be reasonable to modify the XML content as if the nmap scan ran against all ports that we give to nmap at all (so it's the set of all ports found to be open by the port scan), because we know that the port is closed and nmap would have determined the same.

I totally agree that we should not introduce a totally new format for nmap results, that would be very inconvenient. @iiiusky What do you think would be the best way to accomplish that? Maybe it's just some serde magic around the nmap xml?

I think that using a method like nmap -p 80,81,82 10.10.10.21 10.10.10.22 may be the best solution.If it is merged, it may cause many unknown problems, and using nmap is slower.

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

@iiiusky I don't think this is a sane default because it would be against what rustscan tries to accomplish: unnecessary slow nmap port scans. Let' say you want to scan 192.168.178.0/24 where maybe one server has 10 ports open and most other clients have few ports open, then you would have scaled the number of scans performed by nmap by the number of clients.

Maybe this should be an option like --stable-xml that produces the XML from nmap this way while giving up on some performance (in the above example, rustscan should still be a lot faster than just nmap), but I don't think this should be on by default.

Just output all XML files without merging them (maybe capture output and have rustscan create a file with a helpful name?) or merge them similar to for example https://github.com/CBHue/nMap_Merger (haven't tested, just saw that this is out here).

@iiiusky
Copy link
Copy Markdown

iiiusky commented Oct 30, 2020

@iiiusky I don't think this is a sane default because it would be against what rustscan tries to accomplish: unnecessary slow nmap port scans. Let' say you want to scan 192.168.178.0/24 where maybe one server has 10 ports open and most other clients have few ports open, then you would have scaled the number of scans performed by nmap by the number of clients.

Maybe this should be an option like --stable-xml that produces the XML from nmap this way while giving up on some performance (in the above example, rustscan should still be a lot faster than just nmap), but I don't think this should be on by default.

Just output all XML files without merging them (maybe capture output and have rustscan create a file with a helpful name?) or merge them similar to for example https://github.com/CBHue/nMap_Merger (haven't tested, just saw that this is out here).

I don't agree with this method of not merging nmap results and then outputting all the results. This will make the process of parsing the resulting xml very cumbersome. If you want to merge xml, this method is definitely feasible, but there may be unknown problems.

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

Yes, just outputting all XML files separately would probably be a bit messy at times, but it might be the best option in some situations, which is why I wouldn't want to turn the option down so quickly. There is probably a use-case for all three approaches:

  • If you scan a lot of clients and need the speed, but you also have to be sure that the XML is untouched, this method would probably be best.
  • If you definitely need the untouched merged nmap XML, you use what I called --stable-xml above.
  • If none of the above applies, you probably want the XML that is merged by rustscan code for convenience. This would probably be fine in most cases, right? What super terrible errors could occur? @iiiusky You seem to have more experience on this, what is a typical usecase of the XML output?

@iiiusky
Copy link
Copy Markdown

iiiusky commented Oct 31, 2020

Yes, just outputting all XML files separately would probably be a bit messy at times, but it might be the best option in some situations, which is why I wouldn't want to turn the option down so quickly. There is probably a use-case for all three approaches:

  • If you scan a lot of clients and need the speed, but you also have to be sure that the XML is untouched, this method would probably be best.
  • If you definitely need the untouched merged nmap XML, you use what I called --stable-xml above.
  • If none of the above applies, you probably want the XML that is merged by rustscan code for convenience. This would probably be fine in most cases, right? What super terrible errors could occur? @iiiusky You seem to have more experience on this, what is a typical usecase of the XML output?

You can take a look at the introduction of the --resume parameter in the help file of the official website. Also, if the results of nmap are merged, do we need to traverse all nodes and merge? Or is it merged in truncated mode? If it is to traverse all nodes and merge, you need to follow the official nmap.dtd https://github.com/nmap/nmap/commits/master/docs/nmap.dtd definition to keep the latest state.

@bernardoamc
Copy link
Copy Markdown
Contributor

Sorry it took so long to answer @niklasmohrin , life happened. I will take a look at your approach. 🙏

@iiiusky
Copy link
Copy Markdown

iiiusky commented Nov 24, 2020

Yes, just outputting all XML files separately would probably be a bit messy at times, but it might be the best option in some situations, which is why I wouldn't want to turn the option down so quickly. There is probably a use-case for all three approaches:

  • If you scan a lot of clients and need the speed, but you also have to be sure that the XML is untouched, this method would probably be best.
  • If you definitely need the untouched merged nmap XML, you use what I called --stable-xml above.
  • If none of the above applies, you probably want the XML that is merged by rustscan code for convenience. This would probably be fine in most cases, right? What super terrible errors could occur? @iiiusky You seem to have more experience on this, what is a typical usecase of the XML output?

I have a new idea. We can scan different ports for each different host according to what you said above, and then return it, but after the final return, we can spit out a json type array format, and then nmap The scanned results are base64 encoded. If you need to parse the nmap results, you must first parse our json list. The json list contains all the results scanned by nmap, such as this

{
[nmap xml result base64],
[nmap xml result base64],
……
}

What do you think of this? I think this is not very friendly to programs that directly parse the results of nmap, because he must add a new intermediate parsing tool, but programs that use rustscan for secondary development have no such troubles.

@iiiusky
Copy link
Copy Markdown

iiiusky commented Nov 26, 2020

I implemented the functions I wanted above, but did not implement xml merge,@niklasmohrin
link

@niklasmohrin
Copy link
Copy Markdown
Contributor Author

@iiiusky I don't think I understand why base 64 encoded XML in a JSON array is better than just listing all XML outputs with some separator or even output XML that is just a list of all of the nmap XML outputs. Isn't this even more complicated for other programs to parse?


I feel like this is all getting very very complicated and I would rather discuss the scope of all of these changes before working on implementation (again). What should the user API for all of these options look like? There could be the mode approach that I described in this comment or the output options for just the port scan could also be scripts for the scripting engine, removing the need of modes. Please give me some input / opinions @bee-san @bernardoamc

@iiiusky
Copy link
Copy Markdown

iiiusky commented Nov 27, 2020

I consider using base64 to encode xml in json, because there may be some character encoding problems in the process of outputting xml. After encoding it once with base64, the original data can be completely output (I may be worrying too much
, this kind of problem will not occur at all?).

My process of using rustscan is as follows:

  1. The master node calls the rustscan work node
  2. After scanning the results inside rustscan, use nmap scan in rustscan
  3. Get all output results after running
  4. Analyze the results
    5:
    5-1. Store analysis results
    5-2. Call other programs, use the port that the program would use to schedule the next tasks.
    6.………………

In the process of using rustscan, I need to get the original nmap output, but according to the existing situation, if I pass the -oX parameter to nmap, the output information of a single host is good, if there are multiple hosts output then all of them They are displayed together, and the tool cannot distinguish them directly, and must rely on line breaks or keywords.

When I have an appeal problem, can the script solve the situation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Provide a JSON output option

4 participants