Showing 10 changed files with 164 additions and 36 deletions.
@@ -0,0 +1,39 @@
# CRNN for captcha recognition
## General
This is a simple PyTorch implementation of an OCR system for captcha recognition, built from a CNN + RNN model trained with CTC loss.
## Dataset
I used the CAPTCHA Images dataset, downloaded from https://www.kaggle.com/fournierp/captcha-version-2-images
## Files

```
.
├── data
│   └── CAPTCHA Images
│       ├── test
│       ├── train
│       └── val
├── dataset.py
├── model.py
├── output
│   ├── log.txt
│   ├── loss.png
│   └── weight.pth
├── predict.py
├── README.md
├── split_train_val_test.py
├── train.py
└── utils.py
```
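
The tree lists `train`, `val`, and `test` folders next to `split_train_val_test.py`, but that script is not shown in this diff. As a rough sketch only, a three-way split of file names could look like the following. The function name `split_files`, the 80/10/10 ratios, and the fixed seed are assumptions made for illustration, not the repository's actual settings.

```python
import random


def split_files(file_names, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle file names reproducibly and cut them into train/val/test lists."""
    names = sorted(file_names)          # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)  # local RNG, leaves the global random state untouched
    n_val = int(len(names) * val_ratio)
    n_test = int(len(names) * test_ratio)
    val = names[:n_val]
    test = names[n_val:n_val + n_test]
    train = names[n_val + n_test:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = [f"img{i:03d}.png" for i in range(100)]
    train, val, test = split_files(files)
    print(len(train), len(val), len(test))
```

Sorting before shuffling with a seeded generator keeps the split stable between runs, which matters when validation and test numbers are compared across training sessions.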
### Training
```
python train.py
```
Training and validation loss:

![Training and validation loss](output/loss.png)
### Testing
```
python predict.py
```
accuracy = 0.897
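
In the `predict.py` included in this commit, a prediction counts as correct only when the whole decoded string equals the label taken from the image file name. A minimal, framework-free sketch of that metric follows. The helper names `label_from_path` and `exact_match_accuracy` and the sample paths are hypothetical.

```python
import os


def label_from_path(path):
    """Return the file name without directory or extension, e.g. 'ab12c'."""
    return os.path.splitext(os.path.basename(path))[0]


def exact_match_accuracy(paths, predictions):
    """Fraction of predictions that equal the label encoded in the file name."""
    if not paths:
        return 0.0  # avoid dividing by zero on an empty test folder
    correct = sum(label_from_path(p) == pred for p, pred in zip(paths, predictions))
    return correct / len(paths)


if __name__ == "__main__":
    # Hypothetical paths and predictions, for illustration only.
    paths = ["test/ab12c.png", "test/x9y8z.png", "test/mm3nn.png", "test/7k2pq.png"]
    predictions = ["ab12c", "x9y8z", "mm3mn", "7k2pq"]
    print("accuracy =", exact_match_accuracy(paths, predictions))
```

Because a single wrong character makes the whole captcha count as wrong, this metric is stricter than per-character accuracy.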
@@ -0,0 +1,40 @@
import glob
import os
import string

import torch
import torchvision.transforms.functional as F
from PIL import Image
from tqdm import tqdm

from model import CRNN
from utils import LabelConverter

if __name__ == '__main__':
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    label_converter = LabelConverter(char_set=string.ascii_lowercase + string.digits)
    vocab_size = label_converter.get_vocab_size()

    # Load the trained weights and switch to inference mode.
    model = CRNN(vocab_size=vocab_size).to(device)
    model.load_state_dict(torch.load('output/weight.pth', map_location=device))
    model.eval()

    correct = 0
    image_list = glob.glob('data/CAPTCHA Images/test/*')
    for image_path in tqdm(image_list):
        # The file name without its extension is the ground-truth text.
        ground_truth = os.path.splitext(os.path.basename(image_path))[0]
        image = Image.open(image_path).convert('RGB')
        image = F.to_tensor(image).unsqueeze(0).to(device)

        # Gradients are not needed for inference.
        with torch.no_grad():
            output = model(image)
        # Greedy decoding: take the most likely class at each time step.
        encoded_text = output.squeeze().argmax(1)
        decoded_text = label_converter.decode(encoded_text)
        if ground_truth == decoded_text:
            correct += 1

    print('accuracy =', correct / len(image_list))
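
The loop above takes the arg-max class at each time step and hands the result to `label_converter.decode`, whose implementation is not shown in this diff. For a model trained with CTC loss, decoding usually means collapsing consecutive repeats and then dropping the blank symbol. The sketch below shows that rule in plain Python. The function name `ctc_greedy_decode` and the choice of index 0 for the blank are assumptions, and the repository's `LabelConverter` may be organised differently.

```python
import string

CHAR_SET = string.ascii_lowercase + string.digits  # same character set as in the script above
BLANK = 0  # assumed blank index; characters then occupy 1..len(CHAR_SET)


def ctc_greedy_decode(indices, char_set=CHAR_SET):
    """Collapse consecutive repeats, then remove blanks (CTC best-path decoding)."""
    chars = []
    previous = BLANK
    for idx in indices:
        if idx != previous and idx != BLANK:
            chars.append(char_set[idx - 1])  # shift by one because 0 is the blank
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    # Per-time-step arg-max indices for a hypothetical 10-step output.
    steps = [0, 1, 1, 0, 1, 2, 2, 0, 27, 27]
    print(ctc_greedy_decode(steps))  # prints "aab0"
```

The blank between the two runs of `1` is what lets the decoder emit a doubled letter. Without it, the repeated index would collapse into a single `a`.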