Docs for options.cdx + options.cdxj should not turn off detectPages
matteocargnelutti committed Aug 15, 2024
1 parent 18c5f64 commit b857345
Showing 2 changed files with 9 additions and 8 deletions.
10 changes: 5 additions & 5 deletions index.js
@@ -290,20 +290,20 @@ export class WACZ {
      this.detectPages = false
    }

    if (options?.indexFromWARCs === false) {
      this.indexFromWARCs = false
    }

    if (options?.pages) {
      this.detectPages = false
      this.pagesDir = String(options?.pages).trim()
    }

    if (options?.cdxj) {
      this.detectPages = false
      this.indexFromWARCs = false // Added here for clarity, but implied by calls to `this.addCDXJ()`
      this.cdxjDir = String(options?.cdxj).trim()
    }

    if (options?.indexFromWARCs === false) {
      this.indexFromWARCs = false
    }

    if (options?.url) {
      try {
        new URL(options.url) // eslint-disable-line
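The intent stated in the commit title and the updated docs can be sketched as a standalone function. This is only an illustration, assuming the post-commit behavior matches the documentation: `resolveFlags` is a hypothetical helper and not part of the library, and the explicit `detectPages === false` check is inferred from the typedef rather than shown in the hunk.

```javascript
// Hypothetical helper (not part of the library): mirrors the documented
// defaults and the conditions under which each flag is switched off.
function resolveFlags (options = {}) {
  const flags = { detectPages: true, indexFromWARCs: true }

  // Explicit opt-outs. The `detectPages` check is an assumption based on the typedef.
  if (options.detectPages === false) flags.detectPages = false
  if (options.indexFromWARCs === false) flags.indexFromWARCs = false

  // Supplying pre-made pages files disables page detection.
  if (options.pages) flags.detectPages = false

  // Supplying pre-made CDXJ files disables indexing only;
  // per the commit title, it should leave `detectPages` untouched.
  if (options.cdxj) flags.indexFromWARCs = false

  return flags
}
```

With this sketch, passing only `cdxj` keeps page detection on while indexing from WARCs is skipped; passing only `pages` does the opposite.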
7 changes: 4 additions & 3 deletions types.js
@@ -3,16 +3,17 @@
* @typedef {Object} WACZOptions
* @property {string|string[]} input - Required. Path(s) to input .warc or .warc.gz file(s). Glob-compatible.
* @property {string} output - Required. Path to output .wacz file. Will default to PWD + `archive.wacz` if not provided.
* @property {boolean} [indexFromWARCs=true] - If true, will attempt to generate CDXJ indexes from processed WARCs. Automatically disabled if `addCDXJ()` is called.
* @property {boolean} [detectPages=true] - If true (default), will attempt to detect pages in WARC records. Automatically disabled if `pages` is provided or `addPages()` is called.
* @property {?string} pages - Path to a folder containing pages files (pages.jsonl, extraPages.jsonl ...).
* @property {boolean} [indexFromWARCs=true] - If true, will attempt to generate CDXJ indexes from processed WARCs. Automatically disabled if `cdxj` is passed or `addCDXJ()` is called.
* @property {boolean} [detectPages=true] - If true (default), will attempt to detect pages in WARC records to generate a pages.jsonl file. Automatically disabled if: `pages` is provided, or `addPages()` is called.
* @property {?string} url - If set, will be added to datapackage.json as `mainPageUrl`.
* @property {?string} ts - If set, will be added to datapackage.json as `mainPageDate`. Can be any value that `Date()` can parse.
* @property {?string} title - If set, will be added to datapackage.json as `title`.
* @property {?string} description - If set, will be added to datapackage.json as `description`.
* @property {?string} signingUrl - If set, will be used to try and sign the resulting archive.
* @property {?string} signingToken - Access token to be used in combination with `signingUrl`.
* @property {?Object} datapackageExtras - If set, will be appended to datapackage.json under `extras`.
* @property {?string} cdxj - If set, skips indexing and allows for passing CDXJ files "as is". Path to a folder containing CDXJ files.
* @property {?string} pages - If set, allows for passing a pre-set pages.jsonl file. Path to a folder containing pages files (pages.jsonl, extraPages.jsonl ...).
* @property {?any} log - Will be used instead of the Console API for logging, if compatible (i.e: loglevel). Defaults to globalThis.console.
*/
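For illustration, an options object matching this typedef and combining pre-made pages and CDXJ files might look like the sketch below. All paths are placeholders, and how the object is consumed (presumably by the `WACZ` constructor) is an assumption not shown in this diff.

```javascript
// Illustrative only: every path below is a placeholder.
const options = {
  input: 'collection/*.warc.gz', // Required. Glob-compatible path(s) to WARC files.
  output: 'collection.wacz', // Path to the output .wacz file.
  pages: 'collection/pages/', // Folder with pages.jsonl, extraPages.jsonl ...
  cdxj: 'collection/indexes/' // Folder with pre-made CDXJ files.
}

// Assumption: this object would then be handed to the WACZ class,
// e.g. `new WACZ(options)`. That call is not part of this diff.
```

Per the updated docs, `pages` would switch off page detection and `cdxj` would switch off indexing from WARCs.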
