diff --git a/README.md b/README.md
index 168ec8e2c..23debb6cf 100644
--- a/README.md
+++ b/README.md
@@ -1,19 +1,91 @@
-# RetroArch Database
+# Libretro Database
 
-RetroArch incoporates a ROM scanning system to automatically produce playlists. Each ROM that is scanned by the playlist generator is checked against a database of ROMs that are known to be good copies.
+The github repository for databases used by RetroArch.
 
-## Contents
+## RetroArch's Usage of the Database
 
-- [`cht`](cht) Cheats to various games
-- [`cursors`](cursors) Provides methods in order to query the playlists
-- [`dat`](dat) Customized DAT files, maintained by the libretro team
-- [`metadat`](metadat) Different metadata and third-party DATs available to the systems
+Libretro databases allow RetroArch to provide several automated cataloging functions:
+
+- __Validation__. Reject or accept files when using the [Import Scanner / Playlist Generator](https://docs.libretro.com/guides/roms-playlists-thumbnails/#working-with-playlists) based on whether the ROM checksum matches the checksum of a known verified completely intact (aka "properly dumped") file.
+- __Game Naming__. Assign a definitive and uniform display name for each game in a playlist regardless of filename.
+- __Thumbnail Images__. Download and display thumbnail images for games based on the uniform name assigned by the database, regardless of filename. (Thumbnails are __not__ directly assigned by the database or by checksum association, but as a secondary effect of databased *game name* assignment if a matching thumbnail is available on the server. Also see: [Flexible Name Matching Algorithm](https://docs.libretro.com/guides/roms-playlists-thumbnails/#custom-thumbnails).)
+- __Category Search ("Explore")__. Allows the user to find/view games that match selected criteria, e.g. by Developer, Release Year, Genre, and other attributes/metadata.
+- __Per-Game Information View__. Provide an in-app viewable informational screen for each game (Game > Information > Database Entry).
+
+## Repository Contents
+
+### File Types
+
+- __Game information database files__.
+  - __`.dat`__ files in the clrmamepro DAT format, from many [sources](#sources) and across many categories of metadata. The system of dats is multifaceted: alternative or additional sources can be easily added and maintained in a self-contained constituent, some dats may overlap in the games they cover (see [precedence](#precedence)), and some dats cover an exclusive niche of games or attributes.
+  - __`.rdb`__ files used by RetroArch, compiled and amalgamated from the `.dat` files. [RetroArch Database format](https://github.com/libretro/RetroArch/blob/master/libretro-db/README.md) (_no relation to Redis .RDB files_) accommodates RetroArch's [wide range of hardware/OS platforms](https://www.retroarch.com/index.php?page=platforms).
+- __`.cht` cheat code files__. These are game-specific, remain in plain text, and are used as-is by RetroArch if manually selected by the user (see [Cheat Code Documentation](https://docs.libretro.com/guides/cheat-codes/)). Cheat codes are collected from any available source on the web including by manual [contributions from users](https://github.com/libretro/libretro-database/pulls?q=is%3Apr+is%3Aclosed+cheats) who have used RetroArch's built-in [memory address/value search feature](https://docs.libretro.com/guides/cheat-codes/#retroarch-new-cheat-code-searching) to construct new cheat codes. The repository contains one folder for each system (unlike dats), and multiple different cheat files may exist for the same game.
+- __Admin/management scripts__ and files.
+
+### Folder Guide
+
+The non-exhaustive list below serves as a guide to various folders in the repository.
+
+- [`cht`](cht) Cheat codes.
+- [`cursors`](cursors) Methods to query playlists.
+- [`dat`](dat) Customized DAT files maintained by the libretro team, including:
+  - Subset data coverage for games or variants that do/did not have contemporary documentation by upstream database groups, e.g. Virtual Console variants of SNES games, fan translations of NEC PC-98 games, and a superseded squib for PSP Minis.
+  - Game data for monolithic non-generalized cores, e.g. Cave Story, Doom, Quake, etc.
+  - Data adapted from upstream sources that cover a relatively small number of systems and can therefore be housed together in a single repository folder without conflict, e.g. DOS, ScummVM, and GameTDB coverage of GameCube and Wii data. (Though many dats from upstream groups reside in [`metadat`](metadat).)
+- [`metadat`](metadat) Several principal third-party DATs (e.g. No-Intro, Redump, MAME, TOSEC) that each cover a large number of systems and therefore require their own folders in the repository, plus various collections of metadata (some of which may be deprecated). Examples:
+  - [`bbfc`](metadat/bbfc) British Board of Film Classification's ratings for age-appropriateness.
+  - [`elspa`](metadat/elspa) Age-appropriateness/content ratings from the Entertainment and Leisure Software Publishers Association aka the Association for UK Interactive Entertainment ("Ukie").
+  - [`fbneo-split`](metadat/fbneo-split) Includes an XML database (sourced from Logiqx's DTD ROM Management) for special use in arcade ROM scanning: it must be manually selected by the user when running a Manual Scan, it defines the component files within each ROM archive, and is not part of the `.rdb` compile. Also contains a typical `.dat`.
+  - [`mame`](metadat/mame) Similar to `fbneo-split` above.
+  - [`hacks`](metadat/hacks) Data for modified (or "hacked") versions of commercially released games. Many of these data are set by direct manual commits on the Libretro Github.
+  - [`homebrew`](metadat/homebrew) Data for non-officially-published games created by independent creators/programmers.
+  - [`libretro-dats`](metadat/libretro-dats) Ad hoc databases for items that were/are not covered by upstream database groups. Currently includes fan translations of SNES games, and an additional FDS dat that may be redundant with other sources.
+  - [`no-intro`](metadat/no-intro) Bulk import from upstream No-Intro databases. Generally non-disc-based systems.
+  - [`redump`](metadat/redump) Bulk import from upstream Redump databases. Generally disc-based systems.
+  - [`tosec`](metadat/tosec) Bulk import from upstream TOSEC databases. TOSEC data overlaps with and goes beyond other data sets (No-Intro, Redump), but has lower [precedence](#precedence) in libretro and so generally serves as a secondary stopgap.
+  - And more
 - [`rdb`](rdb) The compiled RetroArch database files
 - [`scripts`](scripts) Various scripts that are used to maintain the database files
 
+## Fields & Headers
+
+### Key Field
+The key field for matching varies by the console's typical file size (i.e. original media type).
+
+- __CRC checksum__ for systems with smaller file sizes, e.g. games before the advent of disc-based consoles.
+- __Serial Number__ for larger files like disc-based games, to avoid computing checksums on large files. Found within the ROM file. The serial is not metadata but encoded within the game's binary data, which is scanned (in applicable cases) as a byte array by RetroArch.
+
+CRC and serial also serve as RetroArch's primary index.
+
+Current [build script code](https://github.com/libretro/libretro-super/blob/master/libretro-build-database.sh#L245) can be viewed as a reference for which type of key field RetroArch uses for each console system.
+
+### Fields Specified in Game Information Databases
+
+Database entries for games at minimum specify 1) a game's name, i.e. the display name that RetroArch will assign in playlists and 2) [key field](#key-field) data for matching/indexing and for identifying a file. Further optional metadata may appear. For reasons of informational completeness, future-proofing, and compatibility outside RetroArch, databases contain checksum and cryptographic hashes regardless of the key used for matching.
+
+Example of database entry within [`metadat/no-intro/Atari - 2600.dat`](https://github.com/libretro/libretro-database/blob/master/metadat/no-intro/Atari%20-%202600.dat) for the European region version of _Asteroids_:
+```
+game (
+	name "Asteroids (Europe)"
+	description "Asteroids (Europe)"
+	region "Europe"
+	rom ( name "Asteroids (Europe).a26" size 8192 crc 0A2F8288 md5 8CF0D333BBE85B9549B1E6B1E2390B8D sha1 1CB8F057ACAD6DC65FEF07D3202088FF4AE355CD )
+)
+```
+If other `Atari - 2600.dat` files exist in the repository and contain further metadata for the same crc, the data would be compiled together in the `.rdb`. For example, [`metadat/developer/Atari - 2600.dat`](https://github.com/libretro/libretro-database/blob/master/metadat/developer/Atari%20-%202600.dat#L296) would confer `developer "Atari"` to the above data.
+
+### Header Guidelines for DATs
+
+The `description " "` and `comment " "` fields within a libretro dat's `clrmamepro ( )` header should be used to clarify the origin, source, and/or purpose of the data and file. The description and comment header fields are __intended for documentation__ purposes, are ignored by RetroArch, and can be freely changed without issue. For example, if a .dat includes 3rd party upstream data processed through a github author's build/scrape script(s), the comment and description (or other appropriate header fields) should contain information about _both_ those aspects of the dat's origin. If the .dat file is meant to cover a particular niche of data, the description field should explain it.
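As a rough illustration of the header and entry layout described above, the following Python sketch parses a clrmamepro-style text into a header and a list of game entries, then compares a file's checksum with the entry's `crc` field. It is not RetroArch's scanner or the libretro build tooling: the header values in `SAMPLE_DAT` are invented for illustration, the helper names (`parse_dat`, `crc32_hex`, `matches_entry`) are hypothetical, and it assumes the `crc` field is a standard CRC-32 written as eight hexadecimal digits.

```python
import re
import zlib

# Header values are invented for illustration; the game entry is the
# Asteroids example quoted in the README text.
SAMPLE_DAT = '''
clrmamepro (
    name "Atari - 2600"
    description "Example description: origin and purpose of this dat"
    comment "Example comment: how the data was generated"
)

game (
    name "Asteroids (Europe)"
    description "Asteroids (Europe)"
    region "Europe"
    rom ( name "Asteroids (Europe).a26" size 8192 crc 0A2F8288 md5 8CF0D333BBE85B9549B1E6B1E2390B8D sha1 1CB8F057ACAD6DC65FEF07D3202088FF4AE355CD )
)
'''

# A token is a quoted string, a parenthesis, or a bare word.
TOKEN = re.compile(r'"([^"]*)"|([()])|([^\s()"]+)')


def tokenize(text):
    """Yield (kind, value) tokens; kind is 'open', 'close', or 'word'."""
    for quoted, paren, bare in TOKEN.findall(text):
        if paren == "(":
            yield ("open", paren)
        elif paren == ")":
            yield ("close", paren)
        else:
            yield ("word", bare or quoted)


def parse_block(tokens):
    """Read key/value pairs up to the matching ')' and return them as a dict."""
    block = {}
    for kind, key in tokens:
        if kind == "close":
            break
        value_kind, value = next(tokens)
        if value_kind == "open":
            value = parse_block(tokens)  # nested block such as rom ( ... )
        block[key] = value  # simplification: a repeated key overwrites the earlier one
    return block


def parse_dat(text):
    """Return (header, games) parsed from clrmamepro-style text."""
    tokens = tokenize(text)
    header, games = {}, []
    for _, block_name in tokens:
        next(tokens)  # consume the '(' that opens the top-level block
        block = parse_block(tokens)
        if block_name == "clrmamepro":
            header = block
        elif block_name == "game":
            games.append(block)
    return header, games


def crc32_hex(data: bytes) -> str:
    """CRC-32 of the data as eight uppercase hex digits (assumed DAT notation)."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")


def matches_entry(rom_bytes: bytes, game: dict) -> bool:
    """True if the file's CRC-32 equals the entry's crc key field."""
    return crc32_hex(rom_bytes) == game["rom"]["crc"].upper()


header, games = parse_dat(SAMPLE_DAT)
print(header["name"], "|", header["description"])
print(games[0]["name"], games[0]["rom"]["crc"])
```

A real scan would read the candidate file's bytes and pass them to `matches_entry`; the sketch only covers CRC-keyed systems, not the serial-number lookup used for disc-based games.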
+
+The `name` field (and filename) of a `.dat` file header should match the `database` field that is specified in the [.info file for the cores that use it](https://github.com/libretro/libretro-super/tree/master/dist/info) (often but not always `Manufacturer - Systemname` or similar). The `description` field should be descriptive and informative about the `.dat` file's origin and purpose.
+
+## Precedence
+Databases earlier in the list have precedence over items later in the list. E.g. definitions in `/dat` will over-ride `/metadat` in the final `.rdb` compile if any info conflicts for the same game (i.e. for the same key field).
+
 ## Sources
 
-Generally, RetroArch's scanner is configured for ROMs that have been validated by [No-Intro](http://datomatic.no-intro.org) or Redump DAT files but many other source databases are also in use.
+Many source databases are in use as listed below. The table focusses on the 3rd party sources that predominantly cover each specific console library, but other/multiple sources including manual github contributions are maintained and all are compiled together in the final `.rdb` files (see [Repository Folder Guide](#folder-guide) and each dat's github History for details). ">" signs below indicate the [precedence](#precedence) order when multiple sources overlap for the same subset of games/data.
 
 |System|Source|Repository|
 |----|---|---|
@@ -146,7 +218,11 @@ Generally, RetroArch's scanner is configured for ROMs that have been validated b
 |Watara - Supervision|[No-Intro](http://datomatic.no-intro.org)|[libretro-dats](https://github.com/robloach/libretro-dats)|
 |Wolfenstein 3D| |
 
-## Building
+__Pre-emptive Databases__. Some databases are maintained even if RetroArch currently has no core for the games/system, e.g. GP32, Vita, Original Xbox, and PS3.
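The precedence rule described above can be pictured as a merge keyed on the key field, in which the earliest source to define a field wins and later sources only fill in fields that are still missing. The Python sketch below is an assumption-laden illustration, not the actual `.rdb` compile step: the function name `merge_by_precedence` and the conflicting name `"Asteroids (EU)"` are hypothetical.

```python
def merge_by_precedence(sources):
    """Merge entries by key field; `sources` is ordered highest precedence first."""
    merged = {}
    for entries in sources:
        for key, fields in entries.items():
            target = merged.setdefault(key, {})
            for field, value in fields.items():
                # setdefault keeps the value from the earlier (higher-precedence) source.
                target.setdefault(field, value)
    return merged


# Keyed by CRC. The second source disagrees on the name and adds a developer.
dat = {"0A2F8288": {"name": "Asteroids (Europe)"}}
metadat_developer = {"0A2F8288": {"name": "Asteroids (EU)", "developer": "Atari"}}

result = merge_by_precedence([dat, metadat_developer])
print(result)
```

Here the name from the first source is kept while the `developer` field from the second is added, matching the README's description of conflicting and complementary data for the same key.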
+
+## Maintenance / Technical Usage
+
+### Building
 
 To build a complete set of RDB files for RetroArch or to generate a single RDB file, see [RetroArch/libretro-db/README.md](https://github.com/libretro/RetroArch/blob/master/libretro-db/README.md).
 
@@ -156,7 +232,7 @@ Alternatively, you can run the following command to rebuild all the RDBs locally
 make build
 ```
 
-## Testing
+### Testing
 
 Make sure filenames are Windows file system compatible, and are not too long (eg. [ecryptfs limits filenames to 143 characters](https://unix.stackexchange.com/questions/32795/what-is-the-maximum-allowed-filename-and-folder-size-with-ecryptfs/32834#32834))...
 
@@ -164,6 +240,29 @@ Make sure filenames are Windows file system compatible, and are not too long (eg
 find -exec basename '{}' ';' | egrep '^.{144,}$'
 ```
 
+## Contributions
+
+### Small-Scale Corrections
+
+A vast majority of the database's game information originates from routine imports from upstream data groups (No-Intro, Redump, TOSEC, GameTDB, etc). In cases where the `.dat` for the entry at issue originates from an upstream group, best practice is for a contributor to go through the channels/process of that group. Upstream changes made by the database groups will eventually be imported to the Libretro databases. A seemingly helpful "fix" to Libretro's copy of the database would be overwritten and lost by the next import from upstream.
+
+In cases where the `.dat` in question is created and maintained by Libretro or does not receive bulk over-writes, github contributions are accepted. Refer to the [repository folder guide](#folder-guide) above and to github Histories for information about which libretro databases are applicable for github contributions.
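For contributors who want to run the filename test from the Testing section before opening a pull request, the check can also be sketched in Python. This is not part of the repository's scripts. The 143-character limit comes from the README; the set of characters Windows rejects is general Windows file-naming knowledge, and the function names are hypothetical. Like the `egrep` pattern, it counts characters rather than encoded bytes.

```python
import os

# Characters Windows does not allow in file names (control characters are checked separately).
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')
MAX_NAME_LENGTH = 143  # limit cited in the README's Testing section


def filename_problems(name):
    """Return a list of reasons why `name` may be unsafe, or an empty list."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("too long")
    if any(ch in WINDOWS_FORBIDDEN or ord(ch) < 32 for ch in name):
        problems.append("forbidden character")
    if name.endswith((" ", ".")):
        problems.append("trailing space or dot")
    return problems


def scan(root):
    """Yield (path, problems) for every file or folder under `root` that has problems."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            problems = filename_problems(name)
            if problems:
                yield os.path.join(dirpath, name), problems


if __name__ == "__main__":
    for path, problems in scan("."):
        print(path, "->", ", ".join(problems))
```

Run from the repository root, this prints each offending path with its reasons; an empty output means no name tripped any of the three checks.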
+
+### Folder Structure Revisions
+
+The [build script](https://github.com/libretro/libretro-super/blob/master/libretro-build-database.sh) specifies exact `.dat` files and folders in the repository, therefore organizational housekeeping revisions to the file/folder structure (e.g. combining two metadata fragments into one unified folder and file) require corresponding revisions in the build script.
+
+### Adding A New Database
+
+- Create new `metadat/nameofnewdatabasefolder` and appropriately named system `.dat` file(s) e.g. `Sony - PlayStation.dat` with new data
+- Add the new entry to `libretro-build-database.sh`
+- Run ``make build`` to build the RDB files
+- New types for RetroArch's `Explore` tab require updates to RetroArch code.
+
+## Databases and RetroArch Thumbnails
+
+Currently there is no automatic correspondence between game name updates in databases and image filename updates in the thumbnail repository, so database updates may break thumbnail retrieval. See the [Thumbnail Repository Readme](https://github.com/libretro-thumbnails/libretro-thumbnails#libretro-thumbnails) and [How to Contribute to Thumbnails Guide](https://docs.libretro.com/guides/roms-playlists-thumbnails/#contributing-thumbnails-how-to) for documentation about thumbnail handling.
+
 ## Integrations
 
 There are a few tools out there that allow integration with libretro's database.