Update image (Ubuntu/Python/requirements) #399

Merged
merged 26 commits into from
Mar 31, 2024
26 commits
7895f4d
1st step of rework Docker (update wst_base image)
HadronCollider Mar 9, 2024
81763d7
update dockerhub repo and tag for wst_base image in build script
HadronCollider Mar 9, 2024
2b2af94
update requirements (WIP) and fix using updated libs
HadronCollider Mar 9, 2024
7444699
update Dockerfile (rework + new base image + labels) and docker-compo…
HadronCollider Mar 9, 2024
1fb986b
update docker image naming rules
HadronCollider Mar 9, 2024
d84fc9e
update workflow step (building base image)
HadronCollider Mar 9, 2024
7330454
update db_versioning (if no version == LAST_VERSION)
HadronCollider Mar 9, 2024
7ccda4c
move denoiser module from playground
HadronCollider Mar 9, 2024
2525f08
move denoiser module 2.0
HadronCollider Mar 9, 2024
0adc439
set up python requirement versions
HadronCollider Mar 9, 2024
d9c2a9f
update dockerfiles
HadronCollider Mar 9, 2024
f6f707e
disabled wrong test whisper for GitHub workflow
HadronCollider Mar 9, 2024
0a53514
update sh scripts for building base image
HadronCollider Mar 9, 2024
9a92766
rename base image build script
HadronCollider Mar 9, 2024
95624bb
set ASR_MODEL=medium for deploy (before transfer to .env)
HadronCollider Mar 9, 2024
17c68ac
rm ports to whisper service
HadronCollider Mar 9, 2024
1dacb4d
Update test run in GitHub actions
HadronCollider Mar 9, 2024
5728e9c
Update main.yml Run tests
HadronCollider Mar 9, 2024
4bd9c00
update restart.sh (remove apache/ssl setup; update docker build command)
HadronCollider Mar 9, 2024
3384b42
Update docker tag in workflow (main)
HadronCollider Mar 12, 2024
45b70dd
merge whisper_deploy
HadronCollider Mar 20, 2024
c82a185
Merge branch 'whisper_deploy' of github.com:OSLL/web_speech_trainer i…
HadronCollider Mar 31, 2024
63ce34d
add_label_merge_conflicts
HadronCollider Mar 31, 2024
0dbb056
add local import for websockets (VoskAudioRecognizer)
HadronCollider Mar 31, 2024
d9bc46c
Merge pull request #402 from OSLL/add_label_merge_conflicts
HadronCollider Mar 31, 2024
fbec381
Merge branch 'master' into update_image
HadronCollider Mar 31, 2024
3 changes: 2 additions & 1 deletion .dockerignore
@@ -3,4 +3,5 @@ venv
.idea
.ssl
__pycache__

Dockerfile*
app/playground
15 changes: 15 additions & 0 deletions .github/workflows/label_merge_conflicts.yml
@@ -0,0 +1,15 @@
+name: 'Check for merge conflicts'
+
+on: [push]
+
+jobs:
+  find_conflicts:
+    runs-on: ubuntu-20.04
+
+    steps:
+      - uses: mschilde/auto-label-merge-conflicts@master
+        with:
+          CONFLICT_LABEL_NAME: "has conflicts"
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          MAX_RETRIES: 5
+          WAIT_MS: 5000
6 changes: 3 additions & 3 deletions .github/workflows/main.yml
@@ -4,15 +4,15 @@ on: pull_request

 jobs:
   build:
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-22.04

     steps:
     - uses: actions/checkout@v2

     - name: Build system images (non-pulling)
       run: |
         # build base image
-        docker build -f Dockerfile_base -t osll/wst_base .
+        docker build -f Dockerfile_base -t dvivanov/wst-base:v0.2 .
     - name: Decreasing whisper model for tests
       run: |
@@ -32,4 +32,4 @@ jobs:
       run: |
         docker ps -a
         docker-compose logs
-        docker exec web_speech_trainer_web_1 bash -c 'cd /app/tests && pytest .'
+        docker exec web_speech_trainer_web_1 bash -c 'cd /project/tests && pytest .'
22 changes: 10 additions & 12 deletions Dockerfile
@@ -1,17 +1,15 @@
-FROM osll/wst_base:v0.1
+FROM dvivanov/wst-base:v0.2

-RUN apt update
+LABEL version="0.2"
+LABEL project="wst"

-# The installation of `fitz` library is really tricky.
-# The library uses `frontend` internal package that can be obtained
-# via installation of `PyMuPDF` package but `PyMuPDF` itself requires `fitz`.
-# That's why `fitz` is installed separately.
-RUN pip3 install fitz==0.0.1.dev2
+WORKDIR /project

+COPY requirements.txt requirements.txt
+RUN pip3 install --ignore-installed --no-cache-dir -r requirements.txt
+
-WORKDIR /app
 COPY . .
+RUN rm -rf /project/tests/selenium

-RUN pip3 install -r requirements.txt
-ENV PYTHONPATH='/app/:/app/app/'
-WORKDIR /app/app
-CMD /bin/bash
+ENV PYTHONPATH='/project/:/project/app/'
+WORKDIR /project/app
29 changes: 13 additions & 16 deletions Dockerfile_base
@@ -1,19 +1,16 @@
-FROM ubuntu:18.04
+FROM ubuntu:22.04
 ENV LANG C.UTF-8
-RUN apt-get update && apt-get install -y software-properties-common
-RUN apt-get install -y libgconf2-4 libnss3 libxss1 python3-pip vim ffmpeg exiftool inkscape mupdf mupdf-tools wget unzip
-WORKDIR /usr/local/bin
-RUN wget https://chromedriver.storage.googleapis.com/90.0.4430.24/chromedriver_linux64.zip
-RUN unzip chromedriver_linux64.zip
-RUN wget https://mirror.kraski.tv/soft/google_chrome/linux/90.0.4430.72/google-chrome-stable_90.0.4430.72-1_amd64.deb
-RUN apt-get install -y ./google-chrome-stable_90.0.4430.72-1_amd64.deb
-RUN pip3 install --upgrade pip==21.3.1
-RUN pip3 install --upgrade setuptools

-# for DB dumps
-RUN apt install -y sudo zip mongodb-clients
+LABEL version="0.2"
+LABEL project="wst"

-# for pptx/odp support
-RUN add-apt-repository ppa:libreoffice/ppa
-RUN apt update
-RUN apt install -y unoconv
+RUN apt update && apt install -y software-properties-common
+RUN add-apt-repository ppa:libreoffice/ppa && apt update
+
+RUN apt install -y --no-install-recommends libgconf-2-4 libnss3 libxss1 libmagic1 python3-pip python3-dev ffmpeg exiftool inkscape mupdf mupdf-tools libmagic1 \
+    nano libreoffice-impress default-jre
+
+RUN pip3 install --upgrade pip
+
+COPY requirements.txt requirements.txt
+RUN pip3 install --ignore-installed --no-cache-dir -r requirements.txt
17 changes: 17 additions & 0 deletions Dockerfile_test
@@ -0,0 +1,17 @@
+FROM selenium/standalone-chrome:121.0-chromedriver-121.0-grid-4.18.0-20240220
+
+WORKDIR /usr/src/project
+
+USER root
+RUN apt-get update && \
+    apt-get install -y python3 python3-pip && \
+    rm -rf /var/lib/apt/lists/*
+
+COPY tests/requirements.txt requirements.txt
+RUN pip install -r requirements.txt
+
+COPY tests/selenium .
+
+ENV PYTHONPATH='/project/:/project/app/'
+
+ENTRYPOINT pytest .
2 changes: 1 addition & 1 deletion app/api/dump.py
@@ -84,7 +84,7 @@ def download_dump(backup_name: str) -> (dict, int):
     if backup_name not in backup_filenames or not os.path.isfile(filepath):
         return {'message': "No such backup file: {}".format(backup_name)}, 404

-    return send_file(filepath, as_attachment=True, attachment_filename=backup_name)
+    return send_file(filepath, as_attachment=True, download_name=backup_name)


 def create_db_dump(name):
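The switch from `attachment_filename` to `download_name` in `app/api/dump.py` and `app/api/files.py` tracks the keyword rename in Flask's `send_file` during the 2.x series. A minimal sketch, assuming a deployment that might still run against either signature: the helper inspects a `send_file`-like callable and passes the file name under whichever keyword it accepts. `filename_kwarg`, `send_file_compat`, and the two stand-in functions are hypothetical names for illustration and are not part of this PR.

```python
import inspect


def filename_kwarg(send_file_func):
    """Return the keyword that send_file_func uses for the download file name."""
    params = inspect.signature(send_file_func).parameters
    return 'download_name' if 'download_name' in params else 'attachment_filename'


def send_file_compat(send_file_func, path, name):
    """Call send_file_func, passing the file name under the keyword it accepts."""
    kwargs = {filename_kwarg(send_file_func): name}
    return send_file_func(path, as_attachment=True, **kwargs)


# Stand-ins for the old and new signatures (hypothetical, for illustration only).
def old_send_file(path, as_attachment=False, attachment_filename=None):
    return ('old', path, attachment_filename)


def new_send_file(path, as_attachment=False, download_name=None):
    return ('new', path, download_name)
```

Since this PR pins the requirements inside the image, calling `send_file(..., download_name=...)` directly, as the diff does, is the simpler choice; a shim like this would only matter if the Flask version were left unpinned.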
2 changes: 1 addition & 1 deletion app/api/files.py
@@ -40,7 +40,7 @@ def get_presentation_record_file(presentation_record_file_id: str):

     response = make_response(send_file(
         presentation_record_file,
-        attachment_filename='{}.mp3'.format(presentation_record_file_id),
+        download_name='{}.mp3'.format(presentation_record_file_id),
         as_attachment=as_attachment,
     ))

27 changes: 16 additions & 11 deletions app/api/sessions.py
@@ -6,6 +6,7 @@
 from app.lti_session_passback.auth_checkers import check_auth
 from app.utils import DEFAULT_EXTENSION
 from packaging import version as version_util
+from ua_parser.user_agent_parser import Parse as user_agent_parse

 api_sessions = Blueprint('api_sessions', __name__)
 logger = logging.getLogger('root_logger')
@@ -36,26 +37,30 @@ def get_user_agent():
     """
     if not check_auth():
         return {}, 404
+
+    user_info = user_agent_parse(request.user_agent.string)
+    user_info['os']['family'] = user_info['os']['family'].lower()
+    user_info['user_agent']['family'] = user_info['user_agent']['family'].lower()
     response = {
-        'platform': request.user_agent.platform,
-        'browser': request.user_agent.browser,
-        'version': request.user_agent.version,
+        'platform': user_info['os']['family'],
+        'browser': user_info['user_agent']['family'],
+        'version': user_info['user_agent']['major'],
         'message': 'OK',
         'outdated': False,
         'supportedPlatforms': list(Config.c.user_agent_platform.__dict__.keys()),
         'supportedBrowsers': Config.c.user_agent_browser.__dict__,
     }
-    if request.user_agent.platform not in Config.c.user_agent_platform.__dict__:
+    if user_info['os']['family'] not in Config.c.user_agent_platform.__dict__:
         response['outdated'] = True
-    browser_found = False
-    for (browser, version) in Config.c.user_agent_browser.__dict__.items():
-        if request.user_agent.browser == browser:
-            browser_found = True
-            if version_util.parse(request.user_agent.version) < version_util.parse(version):
+
+    user_browser_name = user_info['user_agent']['family']
+    if user_browser_name in Config.c.user_agent_browser.__dict__:
+        version = Config.c.user_agent_browser.__dict__[user_browser_name]
+        if version_util.parse(user_info['user_agent']['major']) < version_util.parse(version):
             response['outdated'] = True
-                break
-    if not browser_found:
+    else:
         response['outdated'] = True
+
     return response, 200
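The reworked `get_user_agent` reduces to one decision: the client is outdated when its platform is unsupported, its browser is unsupported, or its major version is below the configured minimum. A minimal stdlib-only sketch of that decision as a pure function, assuming the parsed user agent is a nested dict with `os.family`, `user_agent.family` and `user_agent.major` (as the diff reads it) and that the configuration maps lower-case browser names to minimum version strings. `is_outdated` and `version_key` are hypothetical names, and the integer-tuple comparison is a stand-in for `packaging.version.parse`, not what the PR uses. The sketch also treats a missing `major` (which ua-parser can report as `None` for unrecognised agents) as outdated; the diff does not show how the real handler deals with that case.

```python
def version_key(text):
    """Turn '121' or '121.0.1' into a tuple of ints for ordering."""
    return tuple(int(part) for part in str(text).split('.') if part.isdigit())


def is_outdated(user_info, supported_platforms, supported_browsers):
    """Return True when the parsed user agent is unsupported or too old."""
    platform = user_info['os']['family'].lower()
    browser = user_info['user_agent']['family'].lower()
    major = user_info['user_agent']['major']  # may be None for unknown agents

    if platform not in supported_platforms:
        return True
    if browser not in supported_browsers:
        return True
    if major is None:
        return True
    return version_key(major) < version_key(supported_browsers[browser])
```

Keeping the check free of Flask and configuration objects makes it straightforward to unit-test with hand-written dicts.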


4 changes: 3 additions & 1 deletion app/audio_processor.py
@@ -54,7 +54,9 @@ def _try_extract_and_process(self):
             self._hangle_error(training_id, verdict)
             return
         try:
-            audio_length = librosa.get_duration(filename=presentation_record_file)
+            audio_length = librosa.get_duration(path=presentation_record_file)
+            logger.info(f'audio record length: {audio_length} s')
+
             start_time = time.time()

             recognized_audio = self._audio_recognizer.recognize(presentation_record_file)
6 changes: 3 additions & 3 deletions app/audio_recognizer.py
@@ -3,14 +3,13 @@
 import wave

 import requests
-import websockets

 from app import utils
 from app.recognized_audio import RecognizedAudio
 from app.recognized_word import RecognizedWord
-from app.word import Word
 from app.root_logger import get_root_logger
-from playground.noise_reduction.denoiser import Denoiser
+from app.word import Word
+from denoiser import Denoiser

 logger = get_root_logger(service_name='audio_processor')

@@ -101,6 +100,7 @@ def recognize(self, audio):

     async def send_audio_to_recognizer(self, file_name):
         recognizer_results = []
+        import websockets
         async with websockets.connect(self._host) as websocket:
             wf = wave.open(file_name, "rb")
             await websocket.send('''{"config" : { "sample_rate" : 8000.0 }}''')
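Moving `import websockets` from module level into `send_audio_to_recognizer` makes the dependency lazy: importing `app/audio_recognizer.py` no longer requires the package, and only the Vosk code path that opens a socket does. A minimal sketch of the pattern, assuming `websockets` is treated as optional; `websockets_available` and `connect_to_recognizer` are hypothetical names, not functions from this PR.

```python
import importlib.util


def websockets_available():
    """Report whether the optional `websockets` package can be imported."""
    return importlib.util.find_spec('websockets') is not None


def connect_to_recognizer(host):
    """Open a connection object for `host`, importing `websockets` on demand."""
    # Deferred import: resolved on the first call, not when this module loads,
    # so code paths that never reach this function do not need the package.
    import websockets
    return websockets.connect(host)
```

The trade-off is that a missing package surfaces as an `ImportError` at call time rather than at startup, so a check such as `websockets_available()` can be useful when selecting a recognizer.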
2 changes: 1 addition & 1 deletion app/db_versioning/db_versioning.py
@@ -41,7 +41,7 @@ def update_db_version():
     version_doc = DBCollections().db_version.find_one()

     if not version_doc:
-        version_doc_id = add_version(VERSIONS['1.0'])  # if no version == 1.0
+        version_doc_id = add_version(VERSIONS[LAST_VERSION])  # if no version == LAST_VERSION
         version_doc = DBCollections().db_version.find_one({
             '_id': version_doc_id})
     version_doc_id = version_doc['_id']
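With this change, a database that has no version document is recorded as being at `LAST_VERSION` instead of `1.0`, presumably because a freshly created database already matches the current schema and needs no migrations. A minimal in-memory sketch of that effect, assuming versions are dotted strings and the stored document keeps its version under a `version` key; the registry contents, the `version` field name, and both helper functions are hypothetical and only illustrate the idea.

```python
# Hypothetical registry: version name -> description of the schema change.
VERSIONS = {'1.0': 'initial schema', '2.0': 'second schema revision'}
LAST_VERSION = '2.0'  # hypothetical: the newest key in VERSIONS


def initial_version(stored_version_doc):
    """Return the version a database is considered to be at."""
    if not stored_version_doc:
        # No version document: treat the database as already up to date.
        return LAST_VERSION
    return stored_version_doc['version']


def pending_migrations(stored_version_doc):
    """List the versions that still have to be applied, oldest first."""
    ordered = sorted(VERSIONS, key=lambda v: tuple(int(p) for p in v.split('.')))
    current = initial_version(stored_version_doc)
    return ordered[ordered.index(current) + 1:]
```

Under the previous default of `1.0`, the same empty database would have been queued for every migration after `1.0`.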