<!DOCTYPE html>
<html>
<head>
<title>Caltech Library's Digital Library Development Sandbox</title>
<link href='https://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/css/site.css">
</head>
<body>
<header>
<a href="http://library.caltech.edu"><img src="/assets/liblogo.gif" alt="Caltech Library logo"></a>
</header>
<nav>
<ul>
<li><a href="/">Home</a></li>
<li><a href="./">README</a></li>
<li><a href="LICENSE">LICENSE</a></li>
<li><a href="INSTALL.html">INSTALL</a></li>
<li><a href="user-manual.html">User Manual</a></li>
<li><a href="how-to/">Tutorials</a></li>
<li><a href="search.html">Search Docs</a></li>
<li><a href="about.html">About</a></li>
<li><a href="https://github.com/caltechlibrary/datatools">GitHub</a></li>
</ul>
</nav>
<section>
<h1 id="name">NAME</h1>
<p>csvcleaner</p>
<h1 id="synopsis">SYNOPSIS</h1>
<p>csvcleaner <a href="#options">OPTIONS</a></p>
<h1 id="description">DESCRIPTION</h1>
<p>csvcleaner normalizes a CSV file according to the options selected.
It helps address issues such as a variable number of columns per row,
leading or trailing spaces in cells, and non-UTF-8 encoding.</p>
<p>By default, input is read from standard input and output is written
to standard output, with errors going to standard error. These defaults
can be changed with the appropriate options. The CSV file is processed
as a stream of rows, so minimal memory is needed to operate on the file.</p>
<h1 id="options">OPTIONS</h1>
<dl>
<dt>-help</dt>
<dd>
display help
</dd>
<dt>-license</dt>
<dd>
display license
</dd>
<dt>-version</dt>
<dd>
display version
</dd>
<dt>-verbose</dt>
<dd>
write verbose output to standard error
</dd>
<dt>-comma</dt>
<dd>
if set, use this character in place of a comma for delimiting cells
</dd>
<dt>-comment-char</dt>
<dd>
if set, rows starting with this character will be ignored as comments
</dd>
<dt>-fields-per-row</dt>
<dd>
set the number of columns to output, right-padding rows with empty cells as needed
</dd>
<dt>-i, -input</dt>
<dd>
input filename
</dd>
<dt>-left-trim</dt>
<dd>
left trim spaces on CSV out
</dd>
<dt>-o, -output</dt>
<dd>
output filename
</dd>
<dt>-output-comma</dt>
<dd>
if set, use this character in place of a comma for delimiting output
cells
</dd>
<dt>-quiet</dt>
<dd>
suppress error messages
</dd>
<dt>-reuse</dt>
<dd>
if false, a new array is allocated for each row processed; if true,
the array is reused
</dd>
<dt>-right-trim</dt>
<dd>
right trim spaces on CSV out
</dd>
<dt>-stop-on-error</dt>
<dd>
exit on error, useful if you’re trying to debug a problematic CSV file
</dd>
<dt>-trim, -trim-spaces</dt>
<dd>
trim spaces on CSV out
</dd>
<dt>-trim-leading-space</dt>
<dd>
trim leading space from field(s) for CSV input
</dd>
<dt>-use-crlf</dt>
<dd>
if set, use a carriage return and line feed in output
</dd>
<dt>-use-lazy-quotes</dt>
<dd>
use lazy quotes for CSV input
</dd>
</dl>
<h1 id="examples">EXAMPLES</h1>
<p>Normalize a spreadsheet’s column count to 5, padding each row with
empty columns as needed.</p>
<pre><code> cat mysheet.csv | csvcleaner -fields-per-row=5</code></pre>
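<p>To make the padding concrete, here is a rough emulation in awk of
what the <code>-fields-per-row</code> option listed under OPTIONS is
described as doing. It is only a sketch, not csvcleaner’s
implementation: the <code>pad_rows</code> helper and the sample rows are
hypothetical, and it assumes no cell contains a quoted comma. The sketch
leaves rows that already have enough columns unchanged, which may or may
not match what csvcleaner does with over-long rows.</p>
<pre><code>#!/bin/sh
# pad_rows N: hypothetical helper, NOT part of csvcleaner.
# Appends empty cells to each row until it has N columns.
# Naive: splits on every comma, so quoted commas are not handled.
pad_rows() {
  awk -F',' -v OFS=',' -v n="$1" '{
    # While the row has fewer than n fields, append an empty one.
    while (n > NF) $(NF + 1) = ""
    print
  }'
}

# Invented sample data: rows with 2, 3, and 5 columns.
printf 'a,b\nc,d,e\n1,2,3,4,5\n' | pad_rows 5
# prints:
# a,b,,,
# c,d,e,,
# 1,2,3,4,5</code></pre>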
<p>Trim leading spaces from output.</p>
<pre><code> cat mysheet.csv | csvcleaner -left-trim</code></pre>
<p>Trim trailing spaces from output.</p>
<pre><code> cat mysheet.csv | csvcleaner -right-trim</code></pre>
<p>Trim leading and trailing spaces from output.</p>
<pre><code> cat mysheet.csv | csvcleaner -trim-spaces</code></pre>
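<p>As an illustration of what trimming leading and trailing spaces
means per cell, the following sed sketch approximates the effect on
simple input. It is an assumption-laden stand-in rather than
csvcleaner’s own logic: the <code>trim_cells</code> helper and the
sample row are hypothetical, only plain space characters are removed,
and cells containing quoted commas are not handled.</p>
<pre><code>#!/bin/sh
# trim_cells: hypothetical helper, NOT part of csvcleaner.
# Removes spaces at the start and end of each comma-separated cell.
# Spaces inside a cell (between words) are kept.
trim_cells() {
  sed -e 's/^ *//' -e 's/ *$//' -e 's/ *, */,/g'
}

# Invented sample row with stray spaces around each cell.
printf ' first name , last name ,id \n' | trim_cells
# prints:
# first name,last name,id</code></pre>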
<p>csvcleaner 1.2.12</p>
</section>
<footer>
<span><h1><a href="http://caltech.edu">Caltech</a></h1></span>
<span>© 2023 <a href="https://www.library.caltech.edu/copyright">Caltech Library</a></span>
<address>1200 E California Blvd, Mail Code 1-32, Pasadena, CA 91125-3200</address>
<span>Phone: <a href="tel:+1-626-395-3405">(626)395-3405</a></span>
<span><a href="mailto:[email protected]">Email Us</a></span>
<a class="cl-hide" href="sitemap.xml">Site Map</a>
</footer>
</body>
</html>