Tutorial (Indexers, newznab, API, *arr, etc.)

theotherp edited this page Apr 27, 2020 · 9 revisions

The following is meant as an introduction to some of the concepts you need to understand to use NZBHydra properly. Some of the chapters are intentionally kept short because the focus lies elsewhere.

How do indexers work?

Indexers scrape the usenet for nice stuff (mostly movies, TV shows and porn, but also games, apps and music to a lesser degree). These are uploaded split into many parts and are often uploaded under names that a) do not make them easily discoverable and b) make it hard to find out what exactly they contain. So an indexer might find a couple of ZIP files named "ABCD123", which tells it nothing about whether they contain a movie or something else. It may actually download the ZIPs and look inside them. But many uploads are named more or less like their content. It's up to the indexers to find these things and index them, hence the name. For each release they create an NZB file, which contains all the information you need to download this release. You don't need to know anything about what this file looks like or how it's created. In the simplest form the indexer just saves this file and some basic information about the release (size, upload date, etc.). This is what raw search engines like Binsearch do. You can search by the names of these releases and filter by size and age, but if the release is named weirdly you probably won't find it. And if you search for "lost" you may be, well, lost, because there's a lot of stuff out there with that in its name.

"Proper" indexers will try to find out for each release what it actually contains and try to assign it proper metadata, e.g. find the movie and save that information along in the database. If it finds an episode of "Lost" it will add it to its internal list of "Lost" episodes, along with season and episode number. That way you can go to their website and search for "Lost, Season 1, Episode 12" and find releases for exactly that. If you tried to do that with a raw search engine you'd need to enter "Lost s01e12" and would miss any releases that are named "Lost 1x12", for example.

Where do the indexers get all the information about movies and TV shows? From metadata providers like TheTVDB or IMDb (https://www.imdb.com/). The indexers will not only save the metadata but also the provider's ID for the movie or TV show. The TVDB ID for Lost, for example, is 73739.
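To make the weakness of a raw title search concrete, here is a minimal Python sketch. The release names are made up for illustration, and `raw_search` is a hypothetical helper that only mimics a plain "all words must appear in the title" search. It is not how Binsearch or any real indexer is implemented.

```python
# Made-up release names, for illustration only.
releases = [
    "Lost.S01E12.720p.HDTV",
    "Lost 1x12 HDTV",
    "Lost.in.Translation.2003.1080p",
]


def raw_search(query, titles):
    """Naive title search: every word of the query must appear in the title."""
    words = query.lower().split()
    return [title for title in titles if all(word in title.lower() for word in words)]


# Finds only the release that uses the s01e12 naming scheme.
print(raw_search("lost s01e12", releases))

# Far too generic: also matches an unrelated movie.
print(raw_search("lost", releases))
```

A search by metadata avoids both problems, because the indexer has already worked out which show, season and episode each release belongs to.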

What's an API?

Every indexer provides an API (application programming interface) endpoint, a certain URL that can be called to programmatically retrieve the indexer's releases and search them. This API is described by the [newznab spec](https://newznab.readthedocs.io/en/latest/misc/api/), but it should be noted that no indexer actually implements all the functions described there. The API is only meant to be called by programs like NZBHydra or Sonarr. Indexers usually limit access to the API to a certain number of hits per day (because each API request takes a little bit of processing power). Many indexers allow a couple of hits for free users and thousands for paying VIP users.

An API search URL might look like this: https://www.indexer.com/api?apikey=someapikey&t=search&q=whatever.
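As a rough illustration, a URL like this could be assembled in Python as follows. The host and API key are the placeholders from the example above, and `build_api_url` is a hypothetical helper, not something NZBHydra or any indexer provides.

```python
from urllib.parse import urlencode


def build_api_url(base_url, apikey, **params):
    """Assemble a newznab-style API URL from a base URL and query parameters."""
    query = {"apikey": apikey, **params}
    # urlencode escapes special characters (spaces, ampersands, ...) in the values.
    return f"{base_url.rstrip('/')}/api?{urlencode(query)}"


url = build_api_url("https://www.indexer.com", "someapikey", t="search", q="whatever")
print(url)
```

Using `urlencode` instead of plain string concatenation means that queries containing spaces or other special characters still produce a valid URL.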

API searches (the API also allows downloading and some other stuff, but we'll ignore that) can be made using several search parameters which determine which results are returned.

Categories

Each release is assigned a category. These categories are predefined to a certain degree (some indexers invent new ones). Each category has a fixed number which identifies it in searches. The categories are split into main categories and subcategories. Main categories are, for example, "Movies" (2000), "TV" (5000) and "Audio" (3000). Each main category has several subcategories: "Movies" has the subcategories "HD" (2040), "SD" (2030) and others, and "TV" has the subcategories "HD" (5040), "SD" (5030) and others. As you can see, a subcategory always starts with the same digit as its main category. If an API search is made with the parameter cat=2000, only movie results should be returned (that is, any results with a category that starts with "2"). In the same way, a search with cat=2040 should only return HD movie results. It's also possible to combine multiple categories: cat=2000,5000 will only return movie and TV results, and cat=2010,2020 will only return foreign and "other" movies (whatever that is), i.e. movies that have either category assigned. Since cat=2000 already returns every result with a category that starts with "2", it doesn't make sense to search for cat=2000,2010,2020,2030 - that's the same as searching for cat=2000.
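The matching rule described above can be sketched in a few lines of Python. This is only an illustration under the assumption that all categories are four-digit numbers and that main categories are multiples of 1000; `parse_cat_param` and `matches_category` are hypothetical names, and real indexers may handle custom categories differently.

```python
def parse_cat_param(value):
    """Turn a cat parameter such as "2000,5000" into a list of integers."""
    return [int(part) for part in value.split(",") if part.strip()]


def matches_category(release_cat, requested):
    """Return True if a release's category is covered by any requested category."""
    for cat in requested:
        if cat % 1000 == 0:
            # Main category (e.g. 2000): covers every category starting with the same digit.
            if release_cat // 1000 == cat // 1000:
                return True
        elif release_cat == cat:
            # Subcategory (e.g. 2040): must match exactly.
            return True
    return False


print(matches_category(2040, parse_cat_param("2000")))       # HD movie, searching all movies
print(matches_category(5040, parse_cat_param("2000")))       # HD TV, searching all movies
print(matches_category(2030, parse_cat_param("2010,2030")))  # SD movie, searching two subcategories
```

With this rule, adding a subcategory to a request that already contains its main category never changes the outcome, which is why such combinations are redundant.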

Search types

API searches can be made using several functions. These determine how results are searched and what parameters can be added to the search. The search type is defined by the t parameter in the URL.

SEARCH

This is a search in its most basic form (t=search). You can provide a simple text based query using q=whatever which would limit returned results to those with "whatever" in their name. But even that parameter is optional. It's possible just to search t=search&cat=2000 to get a list of the latest movies. This is called an "update query" in NZBHydra because it doesn't search for anything in particular. That's the kind of query periodically made by Sonarr just to keep up-to-date.

It's important to understand that the search function with a query parameter (q=whatever) doesn't use any special logic. It just searches in the release title, nothing else. This involves all the downsides described above, i.e. you might miss releases if you use the wrong words or have too many false positives if your query is too generic.

MOVIE / TVSHOW

So the indexer has already indexed all its releases, assigned metadata and knows exactly which TV show a release contains. Wouldn't it be nice to search for that exact TV show (or even episode)? That can be achieved by using the specific search types t=tvsearch (or t=movie). These allow providing a media ID (as described above) that specifies what you're looking for. t=tvsearch&tvdbid=73739 will search for all episodes of Lost, t=tvsearch&tvdbid=73739&cat=5040&season=1&ep=12 will search for all HD releases of Season 1, Episode 12 of Lost.

The same goes for movies: search for t=movie&imdbid=tt0076759 to only find Star Wars releases.
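The query strings from these examples could be produced with a small Python sketch like the one below. The function names are hypothetical, and the API key and host are left out for brevity; only the parameter names (t, tvdbid, imdbid, cat, season, ep) come from the examples above.

```python
from urllib.parse import urlencode


def tv_search_query(tvdbid, season=None, episode=None, cat=None):
    """Build the query string for an ID-based TV search."""
    params = {"t": "tvsearch", "tvdbid": tvdbid}
    # Optional parameters are only added when they are actually set.
    if cat is not None:
        params["cat"] = cat
    if season is not None:
        params["season"] = season
    if episode is not None:
        params["ep"] = episode
    return urlencode(params)


def movie_search_query(imdbid):
    """Build the query string for an ID-based movie search."""
    return urlencode({"t": "movie", "imdbid": imdbid})


print(tv_search_query(73739))
print(tv_search_query(73739, season=1, episode=12, cat=5040))
print(movie_search_query("tt0076759"))
```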

There are several media ID types and not all of them are supported by all indexers:

  • IMDB ID (for movies). Nearly every indexer supports this.
  • TheTVDB ID (for TV shows). Nearly every indexer supports this.
  • TVmaze (for TV shows). Many indexers support this.
  • The Movie Database. Many indexers support this.
  • IMDB ID (for TV shows). Few indexers support this.
  • TVRage. This was a TV show metadata provider that's been offline for a while. Still supported by many indexers for older TV shows.

It's also possible that a search type is supported but no ID. That means that you can search specifically for movies or TV shows but only using plain text queries.

MUSIC / BOOK

As with TV shows and movies, the spec also defines searches for music and books using specific search parameters (like author and title, or artist and album). Some indexers support these. I can't say how good the results are.

What does all that mean for NZBHydra, Sonarr, etc?

NZBHydra can be used two ways:

  1. As an "artificial" indexer that you plug into Sonarr.
  2. As a GUI to manually search all your indexers in one place.

Either way you have to add all your indexers to NZBHydra. An automatic "caps check" will determine which of the described search types (SEARCH, TVSHOW, MOVIE, AUDIO, BOOK) and which of the search IDs (TVDB, IMDB, etc.) are supported by the indexer. This is done via "brute force": NZBHydra will execute a search for each of the types and IDs and check if at least 90% of the returned results match the search. That way we can be sure that the type/ID is actually supported. If the caps check does not determine a certain search type to be supported, that does not mean that the indexer won't return any results in this area. So an indexer that doesn't support AUDIO will certainly return audio releases; you just can't search the indexer by artist and album or such.
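The 90% rule can be sketched as follows. How NZBHydra actually decides whether a result "matches" is not described here, so this Python example simply assumes a case-insensitive check of the release title and treats an empty result list as unsupported. Both choices, as well as the name `looks_supported` and the sample titles, are assumptions made for illustration.

```python
def looks_supported(result_titles, expected, threshold=0.9):
    """Guess whether a search type/ID is supported from the share of matching results."""
    if not result_titles:
        # Assumption: no results at all is treated as "not supported".
        return False
    hits = sum(1 for title in result_titles if expected.lower() in title.lower())
    return hits / len(result_titles) >= threshold


# Made-up results of a test search by the TVDB ID of Lost.
good = ["Lost.S01E%02d.720p" % n for n in range(1, 10)] + ["Something.Else.2004"]
bad = ["Lost.S01E01.720p", "Random.Movie.2010", "Another.Show.S02E03"]

print(looks_supported(good, "lost"))  # 9 of 10 results match
print(looks_supported(bad, "lost"))   # only 1 of 3 results matches
```

An indexer that ignores an unsupported ID parameter typically returns its latest releases instead, so most results will not match and the check fails.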

So let's say you added NZBHydra to Sonarr. Sonarr makes an update query every 30 minutes or so. NZBHydra queries all its configured indexers, aggregates the results, removes duplicates, filters out some results (if you configured any filters in the config) and returns the list to Sonarr, which may ask for another batch of results. That's the "update query" described above. Now let's say you're missing a certain episode. You can trigger this search manually, but Sonarr will also execute a backlog search now and then. It will call NZBHydra searching for this particular show and episode, e.g. t=tvsearch&tvdbid=73739&season=1&ep=12. NZBHydra will search all indexers which support this search type and ID but will ignore any which don't.
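The selection step at the end of this flow can be pictured with a short Python sketch. The `Indexer` class, its fields and the indexer names are invented for illustration and do not reflect NZBHydra's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Indexer:
    """Hypothetical record of what the caps check found for one indexer."""
    name: str
    search_types: set = field(default_factory=set)
    id_types: set = field(default_factory=set)


def pick_indexers(indexers, search_type, id_type):
    """Keep only indexers that support both the search type and the ID type."""
    return [
        indexer
        for indexer in indexers
        if search_type in indexer.search_types and id_type in indexer.id_types
    ]


indexers = [
    Indexer("indexer-a", {"search", "tvsearch", "movie"}, {"tvdbid", "imdbid"}),
    Indexer("indexer-b", {"search", "tvsearch"}, set()),  # TV search, but no IDs
    Indexer("indexer-c", {"search"}, set()),              # plain text search only
]

selected = pick_indexers(indexers, "tvsearch", "tvdbid")
print([indexer.name for indexer in selected])
```

In this example only the first indexer would be queried for the ID-based search; the other two would be skipped.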

Query generation

To "fix" this you can enable query generation. That means NZBHydra will convert the ID into a title and, if needed, add season and episode to a query and make a text based query using the SEARCH function. In the example above the indexer will be queried using Lost s01e12. It's also possible to enable this only as a fallback which means that an indexer supporting the search type and ID will be searched using these and, if it doesn't return any results, NZBHydra will then execute a search using the generated query.
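A minimal Python sketch of query generation and the fallback mode might look like this. The ID-to-title table, the function names and the two search callables are all assumptions for illustration; NZBHydra resolves titles through metadata providers rather than a hard-coded dictionary.

```python
# Hypothetical lookup table; in practice the title comes from a metadata provider.
TITLES_BY_TVDB_ID = {73739: "Lost"}


def generate_query(title, season=None, episode=None):
    """Build a text query such as "Lost s01e12" from a title and optional numbers."""
    query = title
    if season is not None:
        query += f" s{season:02d}"
        if episode is not None:
            query += f"e{episode:02d}"
    return query


def search_with_fallback(id_search, text_search, tvdbid, season, episode):
    """Try the ID-based search first and fall back to a generated text query."""
    results = id_search(tvdbid, season, episode)
    if results:
        return results
    title = TITLES_BY_TVDB_ID[tvdbid]
    return text_search(generate_query(title, season, episode))


print(generate_query("Lost", 1, 12))

# Stand-in search functions: the ID search finds nothing, the text search echoes its query.
found = search_with_fallback(lambda *args: [], lambda q: [f"result for {q}"], 73739, 1, 12)
print(found)
```

Keep in mind that a generated query is an ordinary SEARCH and therefore inherits its limitations: releases named "Lost 1x12" would still be missed.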

Manual configuration of caps

It's almost never a good idea to manually change the search types and IDs that NZBHydra determined to be supported by an indexer. If you remove any you will get fewer results, and if you add any that aren't actually supported you will get errors.

Torrent trackers and torznab

Most torrent trackers work completely differently from indexers. They don't index stuff: every torrent is usually uploaded manually by somebody, but it may also be scraped from other indexers. They rarely have any of the metadata the indexers have, so they don't know what TV show a certain torrent is for. There are private trackers that do stuff usenet users can only dream of, but that's another story.

Torrent trackers also usually don't have an API that can be searched. To fix that, torznab was invented, which is basically a slightly modified newznab format used to translate tracker searches into a format that can be read programmatically. The most popular program to provide this API access is Jackett. You can configure trackers there and they can then be called by NZBHydra or Sonarr. Jackett will execute the search against the trackers, translate the results and return a torznab result. Jackett is basically for torrent trackers what NZBHydra is for usenet indexers.

Due to their nature these trackers often don't support any search types, or perhaps only one, and rarely any IDs. NZBHydra can read the Jackett config and automatically add all its configured trackers. In this case the supported search types and IDs are pulled from the config and not determined by brute force.