
File analysis worker pool #347

Merged
merged 5 commits on Dec 23, 2024
Conversation

tphakala (Owner)

Implement parallel audio chunk analysis

Description

This PR implements a concurrent processing system for audio chunk analysis to improve performance and CPU utilization. The new implementation uses a worker pool pattern with goroutines to process multiple audio chunks simultaneously.

Key Changes

  • Implemented a worker pool for parallel BirdNET analysis
  • Added buffered channels for efficient chunk processing
  • Introduced configurable number of worker threads (1-8)
  • Added debug logging for troubleshooting
  • Improved CPU utilization during analysis

Technical Details

  • Uses buffered channels for audio chunk distribution
  • Worker count is determined by BirdNET.Threads setting:
    • If set to 0, uses system CPU count
    • Clamped between 1 and 8 workers
    • Each worker runs in its own goroutine
  • Implements three main goroutine types:
    1. Worker routines for BirdNET analysis
    2. Progress monitoring routine
    3. Result collector routine

Performance

The parallel implementation should provide better performance on multi-core systems by:

  • Processing multiple chunks simultaneously
  • Maintaining a constant flow of work through buffered channels
  • Balancing CPU load across available cores

Configuration

No new configuration options are required. The existing BirdNET.Threads setting controls the number of worker threads:

```yaml
birdnet:
  threads: 0  # 0 = auto (use system CPU count, max 8)
```

Debug Support

Added comprehensive debug logging (enabled via debug: true in config) to help troubleshoot processing issues:

  • Worker initialization and completion
  • Chunk processing status
  • Channel operations
  • Error handling

- Introduced comprehensive unit tests for the SunCalc package, covering the creation of SunCalc instances and retrieval of sun event times.
- Implemented tests for sunrise and sunset time retrieval, ensuring that the returned times are valid and not zero.
- Added cache consistency tests to verify that cached sun event times are correctly stored and retrieved.
- These tests enhance the reliability of the SunCalc functionality and ensure accurate calculations for given coordinates.
- Implemented concurrent processing of audio chunks using goroutines, allowing for more efficient analysis with a configurable number of workers (1 to 8).
- Added detailed debug logging throughout the audio processing workflow to aid in monitoring and troubleshooting.
- Replaced custom min and max functions with a single clampInt function to ensure values are constrained between specified limits (1 to 8 workers).

coderabbitai bot commented Dec 23, 2024

Walkthrough

The pull request introduces significant enhancements across multiple files, primarily focusing on audio processing and testing. In internal/analysis/file.go, the processAudioFile function is refactored to improve concurrency and error handling, incorporating a worker pool and buffered channels. A new clampInt function is added to manage the number of workers. The internal/suncalc/suncalc_test.go file is updated with a suite of unit tests for the suncalc package, validating the functionality of sun event time calculations and ensuring cache consistency. Additionally, modifications are made to audio reading functions in the internal/myaudio package.

Changes

| File | Change Summary |
| --- | --- |
| internal/analysis/file.go | Added clampInt function to validate worker count; refactored processAudioFile for concurrent chunk processing; introduced buffered channels for worker communication; enhanced error handling and progress monitoring with timeout |
| internal/myaudio/readfile.go | Removed console output from ReadAudioFileBuffered function |
| internal/myaudio/readfile_flac.go | Modified condition to process last audio chunk in readFLACBuffered function |
| internal/myaudio/readfile_wav.go | Modified condition to process last audio chunk in readWAVBuffered function |
| internal/suncalc/suncalc_test.go | Added comprehensive unit tests for SunCalc methods: tested NewSunCalc constructor, validated sun event time calculations, verified cache consistency |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Main as Main Routine
    participant Workers as Worker Goroutines
    participant Progress as Progress Monitor

    Main->>+Workers: Distribute Audio Chunks
    Workers-->>Main: Process Chunks Concurrently
    Progress->>Main: Monitor Processing Progress
    Workers-->>Main: Return Results/Errors
    Main->>-Main: Aggregate Results
```

Poem

🐰 A Tune of Change, So Bright and Clear,
With workers busy, never fear!
Chunks of audio, processed with grace,
In harmony, they find their place.

🌟✨ Let the sun shine on our tests so bold,
In rabbit's world, new stories unfold!

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
internal/analysis/file.go (3)

99-104: Ensure consistent worker clamping
The logic that uses Threads and then clamps the value between 1 and 8 is clear. However, confirm that any adjacent code (such as the progress monitoring or memory usage) also accounts for this range.


106-108: Consider adding finer-grained debug logging
The debug message is helpful to confirm the number of workers and total chunks being processed, but you may also want to show any derived or computed values such as chunk size or other boundary conditions to aid troubleshooting.


157-175: Progress monitoring goroutine
Regularly updating the terminal with the progress is helpful. However, be mindful of overhead in high-volume scenarios. If the ticker interval is too small and the number of chunks is large, the frequent I/O might slightly degrade performance.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89d3fc4 and f5cd31f.

📒 Files selected for processing (2)
  • internal/analysis/file.go (3 hunks)
  • internal/suncalc/suncalc_test.go (1 hunks)
🔇 Additional comments (12)
internal/analysis/file.go (7)

93-95: File name truncation logic looks good
Truncating the file name to 30 characters is a straightforward approach to avoid overly long console output. This is a user-friendly enhancement.


116-116: Robust concurrency approach
The worker goroutines correctly process chunks in parallel and post results or errors to dedicated channels. The early return upon error ensures immediate notification and helps avoid partial or stalled processing.

Also applies to: 119-155


177-204: Result collector sequence
Collecting results in a single goroutine prevents concurrency issues when appending to allNotes. This is a good approach to avoid data races. The immediate close of doneChan on error is also a sound design choice.


206-239: Buffered file reading and chunk sending
The code properly streams audio chunks and blocks on chunkChan if all workers are busy. This ensures efficient CPU utilization without reading too far ahead. The overlap accounting in line 220 is a neat detail.


300-309: Clamp function is straightforward
The clampInt utility is concise and ensures numWorkers remains within valid bounds. This is a clean and reusable snippet for similar range-clamping needs.


7-7: Use caution with runtime imports on all systems
While using runtime.NumCPU() is helpful to determine the system CPU count, be aware of potential variations in containerized or virtualized environments, which may not always report the actual hardware CPU cores. Ensure testing is performed in those environments if relevant.


110-115: Channel buffering strategy
Using a buffer size of 4 for chunkChan and resultChan is a reasonable start but might need tuning based on typical chunk durations, available RAM, and concurrency levels. Keep an eye on how the worker pool handles bursty conditions.

✅ Verification successful

Let me gather more information about the channel usage and concurrency patterns in the codebase.

The scripts will help us understand:

  1. The goroutine patterns and how these channels are used
  2. Any surrounding context about channel usage
  3. Whether there are any configurable buffer sizes or constants defined elsewhere

This will help us verify if the buffer size of 4 is appropriate or needs adjustment based on the implementation details.


Let me gather more information about the worker pool and concurrency patterns to understand the channel usage better.

This will help us understand:

  1. The worker pool implementation and size
  2. Any configuration related to concurrency
  3. The chunk processing logic and how these channels are used in practice

Let me gather one more piece of information about the chunk processing and buffer usage.

This will help us understand:

  1. The complete worker pool initialization context
  2. Any settings that might affect chunk processing and buffer sizes

Based on the gathered information, I can now provide a final response about the channel buffering strategy.

Buffer size of 4 is appropriate for the worker pool design
The buffer size of 4 for chunkChan and resultChan is well-aligned with the worker pool implementation, which uses between 1-8 workers (clamped based on CPU cores). Since the worker count is capped at 8, a buffer size of 4 provides a good balance between memory usage and preventing worker starvation, as it can hold half of the maximum possible concurrent operations.

The decision is supported by these findings:

  1. The worker pool size is intelligently managed: numWorkers = clampInt(numWorkers, 1, 8)
  2. The buffer size (4) is half of the maximum worker count (8), which is a common pattern to balance memory usage and throughput
  3. The error channel has a buffer of 1, which is appropriate since we only need to handle one error to stop processing
  4. The done channel is unbuffered, which is correct for synchronization signals
🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

```bash
#!/bin/bash
# Search for worker pool implementation and channel usage patterns
ast-grep --pattern 'go func() {
  $$$
}'

# Look for any configuration related to these channels
rg -A 5 "chunkChan|resultChan|errorChan|doneChan"

# Search for any constants or configuration variables that might affect the buffer size
rg "const.*[Bb]uffer|var.*[Bb]uffer"
```

Length of output: 70512


Script:

```bash
#!/bin/bash
# Search for worker pool initialization and usage
ast-grep --pattern 'for $_ := range $workers {
  $$$
}'

# Look for worker pool size configuration or constants
rg "workers|numWorkers|poolSize|concurrency" internal/analysis/

# Look for chunk processing related code
ast-grep --pattern 'ProcessChunk($$$)'
```

Length of output: 1141


Script:

```bash
#!/bin/bash
# Get more context around the worker pool and chunk processing
rg -A 10 "numWorkers := settings.BirdNET.Threads" internal/analysis/file.go

# Look for chunk size or processing related configuration
rg -A 5 "type.*Settings struct" internal/analysis/
```

Length of output: 476

internal/suncalc/suncalc_test.go (5)

8-22: TestNewSunCalc usage
The test ensures the constructor is returning a valid instance and that the observer's latitude/longitude fields match expectations. This is a solid smoke test to confirm initialization.


24-64: Sun event times and caching
Verifies that sunrise/sunset/civil dawn/dusk values are non-zero and that subsequent calls return cached values. This test thoroughly checks correctness and caching behavior.


66-78: TestGetSunriseTime
Checks for errors and ensures the returned time is non-zero. Simple and effective.


80-92: TestGetSunsetTime
Similar to sunrise test—ensures no errors and a valid time. Straightforward coverage.


94-121: TestCacheConsistency
Confirms that the date-based cache entry is created after a call to GetSunEventTimes, and it matches the computed values. This is excellent for maintaining consistency under repeated requests.

…andling

- Added moving average calculation for chunks processed per second to enhance performance monitoring during audio analysis.
- Implemented timeout handling for chunk processing and sending, preventing potential hangs and improving robustness.
- Updated progress logging to include average chunk processing rate, providing better insights into the analysis speed.
- Enhanced final output to display average chunks per second, offering a clearer summary of processing efficiency.
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
internal/analysis/file.go (5)

116-120: Consider optimizing channel buffer sizes for better throughput.

The current buffer size of 4 might lead to unnecessary blocking. Consider increasing the buffer sizes based on the number of workers (e.g., 2 * numWorkers) to maintain a steady flow of work.

```diff
-	chunkChan := make(chan []float32, 4)
-	resultChan := make(chan []datastore.Note, 4)
+	chunkChan := make(chan []float32, 2*numWorkers)
+	resultChan := make(chan []datastore.Note, 2*numWorkers)
```

163-210: Consider reducing progress update frequency.

The current 100ms update interval might cause unnecessary CPU overhead and console flickering. Consider increasing it to 500ms or 1s for a better balance between responsiveness and overhead.

```diff
-		ticker := time.NewTicker(100 * time.Millisecond)
+		ticker := time.NewTicker(500 * time.Millisecond)
```

234-241: Review timeout duration for large audio chunks.

The 5-second timeout might be insufficient for processing large audio chunks on slower systems or under heavy load. Consider making this timeout configurable or increasing it to a more conservative value.

```diff
-			case <-time.After(5 * time.Second):
+			case <-time.After(15 * time.Second):
```

346-355: Consider making clampInt generic for broader reuse.

The function could be made generic to work with any ordered type, making it more reusable across the codebase.

```diff
-func clampInt(value, minValue, maxValue int) int {
+func clamp[T constraints.Ordered](value, minValue, maxValue T) T {
 	if value < minValue {
 		return minValue
 	}
 	if value > maxValue {
 		return maxValue
 	}
 	return value
 }
```

Add the following import:

```go
import "golang.org/x/exp/constraints"
```

93-307: Excellent worker pool implementation with robust error handling.

The concurrent processing system is well-designed with several notable architectural strengths:

  • Proper separation of concerns between workers, progress monitoring, and result collection
  • Comprehensive error handling with timeouts
  • Efficient resource utilization with bounded worker count
  • Detailed debug logging for troubleshooting
  • Progress monitoring with performance metrics

The implementation successfully achieves the PR's objective of enhancing performance through parallel processing while maintaining reliability.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5cd31f and 8809674.

📒 Files selected for processing (1)
  • internal/analysis/file.go (3 hunks)
🔇 Additional comments (2)
internal/analysis/file.go (2)

249-269: LGTM! Robust chunk processing implementation.

The implementation includes proper timeout handling, correct chunk timing updates, and comprehensive error propagation.


272-307: LGTM! Clean implementation of cleanup and statistics.

The implementation properly handles cleanup, waits for completion, and provides useful performance statistics.

Comment on lines 105 to 114
```go
// Determine number of workers (between 1 and 8)
numWorkers := settings.BirdNET.Threads
if numWorkers <= 0 {
	numWorkers = runtime.NumCPU()
}
numWorkers = clampInt(numWorkers, 1, 8) // Ensure between 1 and 8 workers

// Set predStart to 0 time
if settings.Debug {
	fmt.Printf("DEBUG: Starting analysis with %d total chunks and %d workers\n", totalChunks, numWorkers)
}
```

💡 Codebase verification

Worker count configuration needs alignment between packages

The codebase has two different approaches to handling thread/worker count:

  • internal/analysis/file.go clamps workers between 1-8
  • internal/birdnet/birdnet.go only caps at system CPU count with no lower bound

Consider aligning these implementations for consistency:

  • Either apply the 1-8 worker clamp in both places
  • Or document why the analysis package needs different bounds than the BirdNET package
🔗 Analysis chain

LGTM! Worker pool configuration looks good.

The worker count determination is well-implemented with appropriate bounds checking and fallback to CPU count. The debug logging provides good visibility into the configuration.

Let's verify the worker count configuration across the codebase:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

```bash
#!/bin/bash
# Search for any other worker pool implementations that might need similar bounds
rg -g '!{vendor,.*}' -A 3 'runtime\.NumCPU\(\)'
```

Length of output: 757


Script:

#!/bin/bash
# Let's check the implementation of thread configuration in birdnet package
ast-grep --pattern 'func $_($$$) int {
  systemCpuCount := runtime.NumCPU()
  $$$
}'

Length of output: 86


Script:

```bash
#!/bin/bash
# Let's check the entire content of the birdnet package's thread handling
rg -g '!{vendor,.*}' -B 3 -A 5 'configuredThreads'
```

Length of output: 967

- Set the number of workers to a fixed value of 1
- Updated chunk processing logic to ensure at least one chunk is processed, enhancing reliability.
- Adjusted the handling of audio chunks to ensure proper padding and processing of the last chunk
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
internal/analysis/file.go (2)

258-259: Add cleanup for goroutines

The code should ensure proper cleanup of all goroutines, especially in error cases.

Consider using a WaitGroup to track and clean up goroutines:

```diff
+var wg sync.WaitGroup
+wg.Add(numWorkers + 1) // workers + progress monitor
 // Start worker goroutines
 for i := 0; i < numWorkers; i++ {
     go func(workerID int) {
+        defer wg.Done()
         // ... worker logic ...
     }(i)
 }
 close(chunkChan)
-<-doneChan // Wait for processing to complete
+wg.Wait() // Wait for all goroutines to finish
```

Also applies to: 263-264


111-115: Consider adjusting channel buffer sizes

The current buffer size of 4 for channels might be too small for optimal performance with multiple workers.

Consider this approach:

```diff
-chunkChan := make(chan []float32, 4)
-resultChan := make(chan []datastore.Note, 4)
+// Buffer size based on number of workers
+bufferSize := numWorkers * 2
+chunkChan := make(chan []float32, bufferSize)
+resultChan := make(chan []datastore.Note, bufferSize)
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8809674 and 3ced291.

📒 Files selected for processing (4)
  • internal/analysis/file.go (1 hunks)
  • internal/myaudio/readfile.go (0 hunks)
  • internal/myaudio/readfile_flac.go (1 hunks)
  • internal/myaudio/readfile_wav.go (1 hunks)
💤 Files with no reviewable changes (1)
  • internal/myaudio/readfile.go
🔇 Additional comments (2)
internal/myaudio/readfile_flac.go (1)

110-110: Verify the impact of processing small audio chunks on analysis quality

The condition has been broadened to process chunks of any size. While this ensures no audio data is lost, very small chunks might not provide meaningful analysis results.

Let's analyze the typical chunk sizes being processed:

✅ Verification successful

Let me gather more information about how these chunks are processed and what analysis is performed on them.


Let me try a different approach to understand the chunk processing logic.


Let me check one more thing about how these chunks are handled in the main processing loop.


Processing small chunks is safe due to padding mechanism

The code ensures analysis quality by:

  1. Padding any chunk smaller than secondsSamples (3 seconds) to reach the required length
  2. Using the same approach consistently in both WAV and FLAC processing
  3. Maintaining a minimum processing threshold of 1.5 seconds (minLenSamples) while allowing smaller final chunks

The change is safe because:

  • Small chunks are not processed as-is but padded to a consistent length
  • This maintains analysis quality while preventing data loss
  • The approach is consistent across audio formats
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

```bash
#!/bin/bash
# Search for chunk size validations and minimum length requirements
rg -A 3 'minLenSamples.*:=.*conf\.SampleRate'
```

Length of output: 554


Script:

```bash
#!/bin/bash
# Search for the analysis functions that process these chunks
ast-grep --pattern 'func process$_Chunk($$$) {
  $$$
}'

# Also look for any minimum size checks or validations in the analysis
rg -B2 -A5 'currentChunk.*len.*minLenSamples'
```

Length of output: 105


Script:

```bash
#!/bin/bash
# Search for functions that handle these chunks and their processing
rg -B2 -A5 'func.*process.*Chunk'

# Look for any size-related validations or processing logic
rg -B2 -A5 'if.*len.*currentChunk.*>='
```

Length of output: 1221


Script:

```bash
#!/bin/bash
# Look for the main chunk processing loop and any size validations
rg -B5 -A5 'for.*range.*samples'

# Also check where the callback is defined
rg -B2 'type.*Callback'
```

Length of output: 263

internal/myaudio/readfile_wav.go (1)

125-125: LGTM! Consistent implementation across audio formats

The change maintains consistency with the FLAC implementation, ensuring uniform handling of audio chunks across different formats.

Comment on lines +241 to +252
```go
// Read and send audio chunks with timeout
err := myaudio.ReadAudioFileBuffered(settings, func(chunk []float32) error {
	select {
	case chunkChan <- chunk:
		// advance predStart by 3 seconds - overlap
		predStart = predStart.Add(time.Duration((3.0 - bn.Settings.BirdNET.Overlap) * float64(time.Second)))
		return nil
	case <-doneChan:
		return processingError
	case <-time.After(5 * time.Second):
		return fmt.Errorf("timeout sending chunk to processing")
	}
```

🛠️ Refactor suggestion

Add timeout handling for chunk processing

While there's a timeout for sending chunks, there's no timeout for the actual processing of chunks, which could lead to hanging workers.

Consider adding a context with timeout:

```diff
+ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+defer cancel()
 err := myaudio.ReadAudioFileBuffered(settings, func(chunk []float32) error {
     select {
     case chunkChan <- chunk:
         predStart = predStart.Add(time.Duration((3.0 - bn.Settings.BirdNET.Overlap) * float64(time.Second)))
         return nil
     case <-doneChan:
         return processingError
+    case <-ctx.Done():
+        return fmt.Errorf("context timeout: %w", ctx.Err())
     case <-time.After(5 * time.Second):
         return fmt.Errorf("timeout sending chunk to processing")
     }
 })
```

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +127 to +134
```go
notes, err := bn.ProcessChunk(chunk, predStart)
if err != nil {
	if settings.Debug {
		fmt.Printf("DEBUG: Worker %d encountered error: %v\n", workerID, err)
	}
	errorChan <- err
	return
}
```

⚠️ Potential issue

Potential deadlock in error handling

The error handling in worker goroutines could lead to a deadlock. If an error occurs, the worker exits immediately without processing remaining chunks in chunkChan.

Consider this approach:

```diff
 for chunk := range chunkChan {
     notes, err := bn.ProcessChunk(chunk, predStart)
     if err != nil {
         if settings.Debug {
             fmt.Printf("DEBUG: Worker %d encountered error: %v\n", workerID, err)
         }
         errorChan <- err
-        return
+        continue
     }
```

Committable suggestion skipped: line range outside the PR's diff.
