[Request] change format of dwalk text output #555

Open
markmoe19 opened this issue Sep 11, 2023 · 4 comments

@markmoe19

Would it be possible to change the format of the dwalk text output file? (This is related to the request about adding access time.)
For example, for my post-processing it would be great if file modify and access times were reported as integer seconds (epoch time). Thanks!

@markmoe19
Author

@adammoody suggested that a printf()-style format string could be passed to dwalk (or dfind) to format the text output, similar to what +FORMAT does for the date command. That would be great! Currently, my main use case requires both atime and mtime alongside each file's path in the text output, ideally in %s (epoch seconds) format. File size in bytes would be great too. Thanks!
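
To make the idea concrete, here is a minimal sketch of how such a format string might be expanded for one file. Nothing here is dwalk's actual interface: the file_info struct, the format_entry function, and the directive letters (%p path, %z size in bytes, %a atime, %m mtime, both in epoch seconds) are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-file record; mpiFileUtils stores these fields in its own flist type. */
typedef struct {
    const char* path;
    uint64_t size;   /* bytes */
    uint64_t atime;  /* epoch seconds */
    uint64_t mtime;  /* epoch seconds */
} file_info;

/* Expand fmt into buf for one file. Hypothetical directives:
 *   %p path, %z size in bytes, %a atime, %m mtime, %% literal percent.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the output does not fit or a directive is unknown. */
static int format_entry(const char* fmt, const file_info* f, char* buf, size_t bufsize)
{
    assert(fmt != NULL && f != NULL && buf != NULL);
    if (bufsize == 0) {
        return -1;
    }
    buf[0] = '\0';

    size_t used = 0;
    for (const char* c = fmt; *c != '\0'; c++) {
        size_t left = bufsize - used;
        int n;
        if (*c != '%') {
            /* ordinary character: copy it through unchanged */
            n = snprintf(buf + used, left, "%c", *c);
        } else {
            c++;
            switch (*c) {
            case 'p': n = snprintf(buf + used, left, "%s", f->path); break;
            case 'z': n = snprintf(buf + used, left, "%" PRIu64, f->size); break;
            case 'a': n = snprintf(buf + used, left, "%" PRIu64, f->atime); break;
            case 'm': n = snprintf(buf + used, left, "%" PRIu64, f->mtime); break;
            case '%': n = snprintf(buf + used, left, "%%"); break;
            default:  return -1; /* unknown directive or trailing '%' */
            }
        }
        if (n < 0 || (size_t)n >= left) {
            return -1; /* formatting error or output truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

With this sketch, a format of "%z %a %m %p\n" would give one line per file holding size, atime, mtime, and path, which is the layout I am after.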

@adammoody
Member

adammoody commented Sep 29, 2023

Until we have the more general solution, which will likely be a while, it's probably easiest to hack the existing format to suit your needs. You'd want to modify the lines in src/common/mfu_flist_io.c here:

numbytes = snprintf(buffer, bufsize, "%s %s %s %7.3f %3s %s %s\n",
    mode_format, username, groupname,
    size_tmp, size_units, modify_s, file
);

For example, the following patch:

diff --git a/src/common/mfu_flist_io.c b/src/common/mfu_flist_io.c
index 0b0a2e5..285afd5 100644
--- a/src/common/mfu_flist_io.c
+++ b/src/common/mfu_flist_io.c
@@ -22,6 +22,9 @@
 #include <errno.h>
 #include <string.h>
 
+/* defines PRIu64 */
+#include <inttypes.h>
+
 #include "dtcmp.h"
 #include "mfu.h"
 #include "mfu_flist_internal.h"
@@ -1640,9 +1643,9 @@ static size_t print_file_text(mfu_flist flist, uint64_t idx, char* buffer, size_
         const char* size_units;
         mfu_format_bytes(size, &size_tmp, &size_units);
 
-        numbytes = snprintf(buffer, bufsize, "%s %s %s %7.3f %3s %s %s\n",
+        numbytes = snprintf(buffer, bufsize, "%s %s %s %7.3f %3s %" PRIu64 " %" PRIu64 " %" PRIu64 " %s\n",
             mode_format, username, groupname,
-            size_tmp, size_units, modify_s, file
+            size_tmp, size_units, size, acc, mod, file
         );
     }
     else {

changes the lines written by dwalk --text --output list.txt /path so that they print the file size in bytes, atime, and mtime as integers, immediately after the human-readable file size, which is still shown in floating point with units. So rather than the current format of:

drwxrwx--- user1 user1   4.000 KiB Sep 22 2023 17:08 /path
-rw------- user1 user1 854.000   B Sep 22 2023 17:08 /path/CMakeLists.txt
drwx------ user1 user1   4.000 KiB Sep 22 2023 17:08 /path/daos-serialize

it prints as:

drwxrwx--- user1 user1   4.000 KiB 4096 1696015833 1695427689 /path
-rw------- user1 user1 854.000   B 854 1696016217 1695427689 /path/CMakeLists.txt
drwx------ user1 user1   4.000 KiB 4096 1696016421 1695427689 /path/daos-serialize
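
For post-processing, a line in this patched format can be split back into fields with sscanf. The following is only a sketch under some assumptions: the walk_entry struct and parse_walk_line function are hypothetical names, user and group names are assumed to contain no spaces, and everything after the mtime field is treated as the path so that paths containing spaces survive.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed line of the patched text output. */
typedef struct {
    char mode[16];
    char user[64];
    char group[64];
    uint64_t size;   /* bytes */
    uint64_t atime;  /* epoch seconds */
    uint64_t mtime;  /* epoch seconds */
    char path[4096];
} walk_entry;

/* Parse one line such as
 *   -rw------- user1 user1 854.000   B 854 1696016217 1695427689 /path/CMakeLists.txt
 * Returns 0 on success, -1 if the line does not match the patched format. */
static int parse_walk_line(const char* line, walk_entry* e)
{
    assert(line != NULL && e != NULL);

    double human_size; /* human-readable size, parsed but not kept */
    char units[8];     /* size units such as "KiB", parsed but not kept */
    int offset = -1;   /* set by %n to where the path starts */

    int matched = sscanf(line,
        "%15s %63s %63s %lf %7s %" SCNu64 " %" SCNu64 " %" SCNu64 " %n",
        e->mode, e->user, e->group, &human_size, units,
        &e->size, &e->atime, &e->mtime, &offset);
    if (matched != 8 || offset < 0) {
        return -1;
    }

    /* the rest of the line, up to any newline, is the path */
    size_t len = strcspn(line + offset, "\n");
    if (len == 0 || len >= sizeof e->path) {
        return -1;
    }
    memcpy(e->path, line + offset, len);
    e->path[len] = '\0';
    return 0;
}
```

A line in the old format fails to parse here because the month name appears where the integer size is expected, which makes it easy to detect output from an unpatched dwalk.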

@markmoe19
Author

That worked great! I also took out the human-readable size and commented out the lines that format the size and times, which speeds up writing the text output. The final size of the text file is about the same. Best part, my post-processing now has atime information AND is ~5x faster! Thank you! :)
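
For anyone wanting the same thing, the trimmed-down line looks roughly like the sketch below. This is a standalone approximation rather than my exact change: the format_compact wrapper is a hypothetical name, and the argument names simply mirror the ones in the patch above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write one entry as "mode user group size atime mtime path\n", with no
 * human-readable size and no formatted date strings.
 * format_compact is a hypothetical helper, not a function in mpiFileUtils.
 * Returns the number of bytes written, or -1 on error or truncation. */
static int format_compact(char* buffer, size_t bufsize,
                          const char* mode_format, const char* username,
                          const char* groupname,
                          uint64_t size, uint64_t acc, uint64_t mod,
                          const char* file)
{
    assert(buffer != NULL);

    int numbytes = snprintf(buffer, bufsize,
        "%s %s %s %" PRIu64 " %" PRIu64 " %" PRIu64 " %s\n",
        mode_format, username, groupname,
        size, acc, mod, file);

    /* snprintf reports the length it wanted to write, so a value
     * of bufsize or more means the line was cut short */
    if (numbytes < 0 || (size_t)numbytes >= bufsize) {
        return -1;
    }
    return numbytes;
}
```

Since the integers are printed directly, the calls that build the size and date strings are no longer needed on this path.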

@adammoody
Member

Great! Wow, it's surprising that the string formatting adds so much overhead, but that's also good to know about. Good idea to try that.
