fq lint output #39

Open
darked89 opened this issue Jul 31, 2024 · 2 comments

Comments

@darked89

Hi,

I have used fq lint on a batch of files, piping the output to a single text file.
Two issues:

  1. If a given FASTQ file has a problem and does not validate, no line for it appears in the output on stdout. One can redirect stderr to the same or a different file, but after processing a few hundred FASTQ files I would rather parse the output I already have. Which brings me to the next point.
  2. The fq lint output is fine to read but a bit tricky to parse. It contains some formatting characters but no file name or path in each row. Getting `file.foo 12345678`, or rather `file_path number_of_reads`, is not obtainable by simple grepping.

Hope it helps

Darek

@zaeleus
Contributor

zaeleus commented Jul 31, 2024

fq lint does not produce an output as such. It is meant to either succeed or fail, which is signaled by the process's exit code. stdout contains only simple log messages from the command's execution.
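As a rough illustration of relying on the exit code, the sketch below runs a command per file and reports pass or fail from its return status. It assumes `fq` is on the `PATH` and that a single-end file is passed as the only positional argument to `fq lint`; the helper name `lint_ok` is hypothetical and not part of fq.

```python
import subprocess
import sys


def lint_ok(cmd):
    """Run a command and return True if it exits with status 0."""
    # The log messages are discarded here; only the exit code is used.
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Assumed invocation: `fq lint <path>` for a single-end FASTQ file.
    for path in sys.argv[1:]:
        status = "passed" if lint_ok(["fq", "lint", path]) else "failed"
        print(f"{path}\t{status}")
```

This gives one tab-separated line per file, including files that fail validation, but it does not capture the record count.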

Are you able to provide a more concrete example of what you're trying to achieve?

@darked89
Author

darked89 commented Aug 1, 2024

Thank you for a really fast response.

I have two main goals:

  1. check that a given FASTQ file is correct
  2. since fq outputs the number of reads when a FASTQ file validates, I want to capture it.

2024-07-29T20:29:08.280922Z  INFO fq::commands::lint: fq-lint start
2024-07-29T20:29:08.371633Z  INFO fq::commands::lint: validating single end read
2024-07-29T20:29:08.371649Z  INFO fq::validators: disabled validators: []
2024-07-29T20:29:08.371659Z  INFO fq::validators: enabled single read validators: ["[S003] NameValidator", "[S004] CompleteValidator", "[S002] AlphabetValidator", "[S001] PlusLineValidator", "[S005] ConsistentSeqQualValidator", "[S006] QualityStringValidator"]
2024-07-29T20:29:08.371667Z  INFO fq::validators: enabled paired read validators: []
2024-07-29T20:29:08.371671Z  INFO fq::commands::lint: starting validation
2024-07-29T20:39:57.031928Z  INFO fq::commands::lint: read 48843609 records
2024-07-29T20:39:57.031963Z  INFO fq::commands::lint: fq-lint end

Rather than reworking the existing output lines, the best option would be a final line:

RESULT file_path_or_name validation_passed 48843609

For the failed one (if possible):

RESULT failed_file_path_or_name failed 0_or_num_of_reads_before_fail

In both cases the columns would be separated by TABs (easiest to read), with no special characters to beautify the RESULT line.
That way, ingesting the useful data would be trivial.
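Until such a line exists, something similar can be approximated outside fq. The sketch below builds the proposed tab-separated RESULT line from captured log text and an exit code. It assumes the `read N records` message keeps the form shown in the log above, which may differ between fq versions, and it falls back to 0 when no such line is found, since whether fq logs a count on failure is not established here. The function name `result_line` and the file name `sample.fastq` are hypothetical.

```python
import re

# Matches the record-count message as it appears in the log above.
RECORDS_RE = re.compile(r"fq::commands::lint: read (\d+) records")


def result_line(path, log_text, exit_code):
    """Build a tab-separated RESULT line from fq lint log text and exit code."""
    match = RECORDS_RE.search(log_text)
    # Fall back to "0" when the log has no record-count message.
    count = match.group(1) if match else "0"
    status = "validation_passed" if exit_code == 0 else "failed"
    return "\t".join(["RESULT", path, status, count])


# Two lines taken from the log quoted above.
sample_log = (
    "2024-07-29T20:39:57.031928Z  INFO fq::commands::lint: read 48843609 records\n"
    "2024-07-29T20:39:57.031963Z  INFO fq::commands::lint: fq-lint end\n"
)
print(result_line("sample.fastq", sample_log, 0))
```

Keying on the message text rather than a column position avoids depending on the timestamp or on any formatting characters around the log level.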

Many thanks for your help
