Commit

first commit
vgocoder committed Feb 17, 2022
0 parents commit 4619a87
Showing 4 changed files with 80 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .private-env
@@ -0,0 +1,3 @@
# fscrawler
export ELASTIC_VERSION=7.17.0
export FSCRAWLER_VERSION=2.10-SNAPSHOT-ocr-es6
30 changes: 30 additions & 0 deletions README.md
@@ -0,0 +1,30 @@
# docker-compose-fscrawler

> Mostly inspired by [fscrawler docs](https://fscrawler.readthedocs.io/en/latest/dev/doc.html)

## What
> Build a basic search engine with Elasticsearch and FSCrawler, and start it quickly with Docker Compose.

## How to use
### Source the version env file

```
# export ELASTIC_VERSION=7.17.0
# export FSCRAWLER_VERSION=2.10-SNAPSHOT-ocr-es6
source .private-env
```
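
If either variable is unset, Compose substitutes an empty string into the image tags. A minimal guard, assuming a POSIX shell (the `check_versions` helper is hypothetical and not part of this repository), could catch that before anything starts:

```shell
# Hypothetical helper: report any version variable that is unset or empty.
# Returns 0 when both are set, 1 otherwise.
check_versions() {
  missing=0
  for name in ELASTIC_VERSION FSCRAWLER_VERSION; do
    # Indirect variable lookup that works in plain POSIX sh
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_versions && docker-compose up -d elasticsearch` would then refuse to start the stack until `.private-env` has been sourced.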

### Run Elasticsearch

```
docker-compose up -d elasticsearch
docker-compose logs -f elasticsearch
```
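
Note that `depends_on` in `docker-compose.yml` only controls start order; it does not wait for Elasticsearch to accept requests. A small retry helper, sketched here assuming a POSIX shell (`wait_for` is a hypothetical name, not a Compose or FSCrawler command), can poll until a command succeeds:

```shell
# Hypothetical helper: wait_for TRIES DELAY CMD...
# Runs CMD up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    if [ "$tries" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `wait_for 30 2 curl -fsS http://localhost:9200 && docker-compose up fscrawler` would poll the published port 9200 up to 30 times, two seconds apart, before starting FSCrawler.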

### Run FSCrawler

```
docker-compose up fscrawler
```
4 changes: 4 additions & 0 deletions config/job_name/_settings.yaml
@@ -0,0 +1,4 @@
name: "job_name"
elasticsearch:
  nodes:
  - url: "http://elasticsearch:9200"
43 changes: 43 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,43 @@
version: '3'
services:
  # Elasticsearch Cluster
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:$ELASTIC_VERSION
    container_name: elasticsearch
    environment:
      - bootstrap.memory_lock=true
      - discovery.type=single-node
    restart: always
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - fscrawler_net

  # FSCrawler
  fscrawler:
    image: dadoonet/fscrawler:$FSCRAWLER_VERSION
    container_name: fscrawler
    restart: always
    volumes:
      - ${PWD}/config:/root/.fscrawler
      - ${PWD}/logs:/usr/share/fscrawler/logs
      - ../../test-documents/src/main/resources/documents/:/tmp/es:ro
    depends_on:
      - elasticsearch
    command: fscrawler --rest idx
    networks:
      - fscrawler_net

volumes:
  data:
    driver: local

networks:
  fscrawler_net:
    driver: bridge
