
enable wsh file cross-remote copy/move #1725

Open · wants to merge 150 commits into base: main
Conversation

@esimkowitz (Member) commented Jan 13, 2025

This adds the ability to stream tar archives over channels between wsh instances. The main use cases for this are remote copy and move operations.

It also completes the wavefs implementation of the FileShare interface to allow copy/move interoperability between wavefiles and other storage types.

The tar streaming functionality has been broken out into the new tarcopy package for easy reuse.

New fileshare functions are added: CopyInternal, which copies files within a single filesystem and bypasses the expensive interop layer, and MoveInternal, which does the same for moves within a filesystem. Copying between remotes is now handled by CopyRemote, which accepts the source FileShareClient as a parameter. wsh connections share one implementation for CopyInternal and CopyRemote, because both must request the channel on the remote destination; we don't offer a way to pass channels as a parameter to a remote call.

This also adds a recursive -r flag to wsh file rm to allow for deleting a directory and all its contents.

S3 support will be addressed in a future PR.

coderabbitai bot commented Jan 13, 2025

Walkthrough

The pull request introduces several significant changes across multiple packages in the Wave Terminal project. The modifications primarily focus on enhancing file operation capabilities, particularly for file deletion and streaming. A new iochantypes package is introduced to support a more structured approach to data handling, with a Packet type that includes data and a checksum.

Key changes include adding a recursive deletion option for file removal commands, updating method signatures to use a new CommandDeleteFileData type, and modifying streaming methods to return iochantypes.Packet instead of raw byte slices. The ListBuckets function's implementation is simplified by removing pagination, and AWS-related functionality is removed from the file share package. Additionally, several methods across various components have been renamed or modified to enhance functionality and error handling. These changes span frontend and backend components, affecting file share clients, RPC interfaces, and web handling mechanisms, thereby providing more flexible and robust file management capabilities across different file system types and remote connections.



coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
pkg/util/fileutil/fileutil.go (1)

170-176: Add documentation for the FsFileInfo type.

The struct is well-designed and implements fs.FileInfo correctly. However, it would benefit from documentation explaining its purpose and relationship with wshrpc.FileInfo.

Add this documentation above the type declaration:

+// FsFileInfo implements fs.FileInfo interface and provides a conversion wrapper
+// for wshrpc.FileInfo to be used with standard library file operations.
 type FsFileInfo struct {
pkg/util/iochan/iochan.go (2)

Line range hint 21-57: LGTM! The SHA-256 checksum implementation enhances data integrity.

The changes improve security by adding checksum verification while maintaining good performance characteristics. The error handling is comprehensive.

Consider adding a constant for the channel buffer size (32) to make it configurable:

+const defaultChannelBufferSize = 32
 
 func ReaderChan(ctx context.Context, r io.Reader, chunkSize int64, callback func()) chan wshrpc.RespOrErrorUnion[iochantypes.Packet] {
-	ch := make(chan wshrpc.RespOrErrorUnion[iochantypes.Packet], 32)
+	ch := make(chan wshrpc.RespOrErrorUnion[iochantypes.Packet], defaultChannelBufferSize)

Line range hint 59-98: LGTM! Robust checksum verification with proper error handling.

The implementation correctly verifies data integrity before writing and properly handles cleanup.

Consider enhancing the checksum mismatch error message to include the actual vs expected checksums for easier debugging:

-						cancel(fmt.Errorf("WriterChan: checksum mismatch"))
+						cancel(fmt.Errorf("WriterChan: checksum mismatch (got: %x, expected: %x)", localChecksum, resp.Response.Checksum))
pkg/remote/fileshare/wavefs/wavefs.go (2)

103-151: Improve error handling in tar header writing.

The implementation looks good overall, with proper path cleaning, timeout handling, and streaming setup. However, consider adding error handling for the writeHeader operation to ensure proper cleanup in case of errors.

 if err = writeHeader(fileutil.ToFsFileInfo(file), file.Path); err != nil {
+    tarClose()
     rtn <- wshutil.RespErr[iochantypes.Packet](fmt.Errorf("error writing tar header: %w", err))
     return
 }

392-428: Add deferred cleanup for file operations.

While the implementation is solid, consider adding deferred cleanup for file operations to ensure proper resource management in case of errors.

 data, err := os.Open(file)
 if err != nil {
     return fmt.Errorf("error reading source blockfile: %w", err)
 }
+defer data.Close()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ed10f7 and 3485489.

📒 Files selected for processing (6)
  • cmd/wsh/cmd/wshcmd-file.go (3 hunks)
  • pkg/remote/fileshare/wavefs/wavefs.go (3 hunks)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/util/iochan/iochan.go (3 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (8 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pkg/util/tarcopy/tarcopy.go

45-45: S1017: should replace this if statement with an unconditional strings.TrimPrefix

(gosimple)

pkg/remote/fileshare/wavefs/wavefs.go

441-441: ineffectual assignment to err

(ineffassign)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (10)
pkg/util/fileutil/fileutil.go (3)

16-16: LGTM! Required imports added.

The new imports are necessary for the time conversion functionality and wshrpc integration.

Also applies to: 19-19


178-200: LGTM! Complete fs.FileInfo interface implementation.

The implementation is correct and follows Go's standard library patterns. The ModTime conversion using nanosecond precision is appropriate for file timestamps.


202-212: LGTM! Clean conversion with compile-time interface verification.

The type assertion ensures interface compliance, and the conversion function correctly handles the transformation from wshrpc.FileInfo to FsFileInfo. Good practice returning a value type for immutability.

Let's verify that wshrpc.FileInfo contains all the required fields:

✅ Verification successful

✅ Verified: Field compatibility confirmed between wshrpc.FileInfo and FsFileInfo

The conversion function correctly maps all required fields with matching types and names from wshrpc.FileInfo to FsFileInfo, ensuring proper interface implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify wshrpc.FileInfo structure matches FsFileInfo fields
ast-grep --pattern 'type FileInfo struct {
  Name string
  Mode os.FileMode
  Size int64
  ModTime int64
  IsDir bool
}'

Length of output: 122


Script:

#!/bin/bash
# Find files that might contain FileInfo definition
fd -t f . | grep -i fileinfo

# Try to find the struct definition with a more flexible pattern
ast-grep --pattern 'type FileInfo struct {
  $$$
}'

Length of output: 1587

cmd/wsh/cmd/wshcmd-file.go (2)

101-101: LGTM: Recursive flag added correctly.

The recursive flag is properly added to the fileRmCmd with an appropriate description.


263-266: LGTM: Recursive flag properly integrated.

The recursive flag is correctly:

  1. Retrieved from command flags with proper error handling
  2. Passed to the FileDeleteCommand

Also applies to: 280-280

pkg/remote/fileshare/wavefs/wavefs.go (2)

478-478: LGTM: Delete method signature updated correctly.

The method signature now includes the recursive parameter, properly aligning with the interface changes for recursive deletion support.


377-389: LGTM: MoveInternal implementation is robust.

The implementation correctly:

  1. Validates host matching
  2. Sequences copy and delete operations
  3. Handles errors appropriately
  4. Propagates the recursive flag
pkg/wshrpc/wshremote/wshremote.go (3)

Line range hint 232-303: LGTM: RemoteTarStreamCommand implementation is comprehensive.

The implementation correctly:

  1. Handles path expansion and validation
  2. Manages timeouts
  3. Implements tar streaming
  4. Validates directory operations with recursive flag

Line range hint 348-439: LGTM: RemoteFileCopyCommand changes enhance robustness.

The changes add:

  1. Proper timeout handling with context
  2. Progress tracking with file counts and bytes
  3. Security checks for directory traversal
  4. Comprehensive error handling

755-775: LGTM: RemoteFileDeleteCommand properly supports recursive deletion.

The implementation correctly:

  1. Handles path expansion and cleaning
  2. Implements recursive deletion with proper validation
  3. Provides clear error messages for different scenarios

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/util/tarcopy/tarcopy.go (1)

46-49: ⚠️ Potential issue

Add path traversal check to ensure secure extraction.

Although you're using filepath.Clean(), a malicious path could still contain .. segments that traverse upward. Consider rejecting paths after cleaning if they end up outside the intended directory.

 header.Name = filepath.Clean(strings.TrimPrefix(file, pathPrefix))
+if strings.Contains(header.Name, "..") {
+    return fmt.Errorf("invalid path containing directory traversal: %s", file)
+}
 if header.Name == "" {
     return nil
 }
🧹 Nitpick comments (5)
pkg/util/tarcopy/tarcopy.go (2)

50-52: Simplify by using unconditional TrimPrefix.

Per the static analysis suggestion (S1017), you can avoid the if by unconditionally trimming the leading slash:

-if strings.HasPrefix(header.Name, "/") {
-    header.Name = header.Name[1:]
-}
+header.Name = strings.TrimPrefix(header.Name, "/")
🧰 Tools
🪛 golangci-lint (1.62.2)

50-50: S1017: should replace this if statement with an unconditional strings.TrimPrefix

(gosimple)


29-37: Eliminate repetitive close-with-retries loops via a helper function.

The code for closing pipeReader, pipeWriter, and tarWriter each repeats nearly identical retry logic. Consider extracting this into a dedicated helper to make the code shorter, more maintainable, and less error-prone.

+func closeWithRetries(closer io.Closer, label string) {
+    for retries := 0; retries < maxRetries; retries++ {
+        if err := closer.Close(); err != nil {
+            log.Printf("%s: error closing: %v, trying again in 10ms\n", label, err)
+            time.Sleep(retryDelay)
+            continue
+        }
+        break
+    }
+}

 // Example usage:
- for retries := 0; retries < maxRetries; retries++ {
-     if err := pipeReader.Close(); err != nil { ... }
- }
+closeWithRetries(pipeReader, "TarCopySrc: pipeReader")

Also applies to: 60-76, 82-91, 93-103

pkg/remote/fileshare/wavefs/wavefs.go (3)

105-158: LGTM! Comprehensive tar streaming implementation with proper error handling.

The implementation correctly handles timeouts, error cases, and both files and directories.

Consider these improvements:

  1. Preserve original file permissions instead of hard-coding to 0644
  2. Add support for symbolic links
-			file.Mode = 0644
+			// Preserve original file mode or use a default
+			if file.Mode == 0 {
+				file.Mode = 0644
+			}

388-401: Consider implementing atomic move operations.

While the implementation is correct, a failure in Delete after a successful Copy could leave the system in an inconsistent state.

Consider implementing a rollback mechanism:

 func (c WaveClient) MoveInternal(ctx context.Context, srcConn, destConn *connparse.Connection, opts *wshrpc.FileCopyOpts) error {
 	if srcConn.Host != destConn.Host {
 		return fmt.Errorf("move internal, src and dest hosts do not match")
 	}
+	// Store original state for potential rollback
+	_, originalData, err := filestore.WFS.ReadFile(ctx, srcConn.Host, srcConn.Path)
+	if err != nil {
+		return fmt.Errorf("error reading source file: %w", err)
+	}
-	err := c.CopyInternal(ctx, srcConn, destConn, opts)
+	err = c.CopyInternal(ctx, srcConn, destConn, opts)
 	if err != nil {
 		return fmt.Errorf("error copying blockfile: %w", err)
 	}
 	err = c.Delete(ctx, srcConn, opts.Recursive)
 	if err != nil {
+		// Rollback the copy on delete failure
+		rollbackErr := filestore.WFS.WriteFile(ctx, srcConn.Host, srcConn.Path, originalData)
+		if rollbackErr != nil {
+			return fmt.Errorf("move failed and rollback failed: delete error: %v, rollback error: %v", err, rollbackErr)
+		}
 		return fmt.Errorf("error deleting blockfile: %w", err)
 	}
 	return nil
 }

559-567: Convert magic number to named constant.

The magic number 10 should be a named constant with a clear purpose.

+const minPathPrefixLength = 10 // Minimum length for a valid path prefix (scheme://host/)

 func getPathPrefix(conn *connparse.Connection) string {
 	fullUri := conn.GetFullURI()
 	pathPrefix := fullUri
 	lastSlash := strings.LastIndex(fullUri, "/")
-	if lastSlash > 10 && lastSlash < len(fullUri)-1 {
+	if lastSlash > minPathPrefixLength && lastSlash < len(fullUri)-1 {
 		pathPrefix = fullUri[:lastSlash+1]
 	}
 	return pathPrefix
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3485489 and 0c44c24.

📒 Files selected for processing (3)
  • pkg/remote/connparse/connparse.go (1 hunks)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pkg/util/tarcopy/tarcopy.go

50-50: S1017: should replace this if statement with an unconditional strings.TrimPrefix

(gosimple)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
pkg/remote/connparse/connparse.go (1)

60-62: LGTM with suggestions for robustness.

The implementation is clean and consistent with other URI-related methods in the struct. However, consider adding validation for empty fields to prevent malformed URIs.

Consider this more robust implementation:

 func (c *Connection) GetSchemeAndHost() string {
+    if c.Scheme == "" || c.Host == "" {
+        return ""
+    }
     return c.Scheme + "://" + c.Host
 }

Let's verify the usage context to ensure empty values won't cause issues:

pkg/util/tarcopy/tarcopy.go (1)

105-127: No specific issues found in the tar read loop.

Overall logic is sound: context checks, EOF detection, and error returns are handled properly.

pkg/remote/fileshare/wavefs/wavefs.go (5)

7-7: LGTM! Import changes align with tar streaming functionality.

The new imports support the implementation of tar streaming for remote file operations.

Also applies to: 12-12, 14-14, 22-24


178-185: LGTM! Improved logging and nil safety.

Added debug logging and proper nil handling for opts parameter.


385-385: LGTM! Better error handling for unsupported operation.

Explicitly returning errors.ErrUnsupported instead of nil improves error handling clarity.


441-494: ⚠️ Potential issue

Fix ineffectual assignment to err variable.

The error from cleanPath is assigned but not checked.

Apply this fix:

-        fileName, err := cleanPath(path.Join(destPrefix, next.Name))
+        fileName, _ := cleanPath(path.Join(destPrefix, next.Name))

Likely invalid or redundant comment.


403-439: Address potential race condition in file creation.

The gap between checking file existence and creating it could lead to a race condition in concurrent operations.

Consider using atomic operations or proper locking mechanisms. Also, verify if the underlying filestore.WFS provides atomic guarantees.

coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (5)
pkg/util/tarcopy/tarcopy.go (2)

71-108: Path bounding in TarCopyDest function.

While TarCopyDest focuses on reading tar entries, it processes each header without verifying whether the extracted file paths remain within the intended directory structure. In a multi-tenant or security-sensitive environment, malicious tar headers could overwrite unintended files.

Ensure you have an optional validation step to confirm that the final paths for extracted files remain restricted to the desired target directory.


110-124: Capture final error on closure failure.

gracefulClose logs each close attempt’s error, but if all attempts fail, the final message doesn’t include the last error details. In complex workflows, losing this error context can complicate debugging.

Consider tracking and logging the last encountered error in the final log.Printf statement to provide more comprehensive diagnostic information.

pkg/remote/fileshare/wavefs/wavefs.go (3)

105-124: Short deadline may cause unexpected cancellation.

A default timeout of 100ms (line 115) might be too short for large file listings or slow networks. If large directories are encountered, this can result in partial tar data.

Consider increasing the default timeout or making it clearly configurable to avoid premature cancellation.


178-185: Enhanced logging for ListEntries.

Line 178 logs connection details, which can be beneficial for debugging. Ensure sensitive or private information is not leaked in logs.


388-399: Partial failures in MoveInternal.

If CopyInternal succeeds but Delete fails, the operation leaves a partially moved file. Depending on your needs, consider an all-or-nothing approach or an explicit rollback.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f149473 and 9dc8ebf.

📒 Files selected for processing (2)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pkg/util/tarcopy/tarcopy.go

56-56: S1017: should replace this if statement with an unconditional strings.TrimPrefix

(gosimple)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
pkg/util/tarcopy/tarcopy.go (1)

1-33: Proper package documentation & import usage.

The overall package documentation is clear, and imports are well-structured. No immediate issues are spotted within these lines.

pkg/remote/fileshare/wavefs/wavefs.go (5)

7-24: Imports adjusted for tar-based copying.

The newly added import statements (e.g., "archive/tar", "github.com/wavetermdev/waveterm/pkg/util/tarcopy") appear appropriate for the cross-remote copy/move feature described in the PR objectives. No immediate concerns.


127-157: Directory handling in ReadTarStream.

When encountering directories (line 139), the loop simply continues without recursion. This is likely intentional but consider clarifying with a comment if directory recursion is out of scope.


385-385: Unsupported Mkdir.

Line 385 returns errors.ErrUnsupported, ensuring clarity that wavefs does not support directories. This is consistent with the overall wavefile approach.


403-439: CopyInternal only supports same-host transfers.

Line 437 raises an error if hosts differ. This is consistent with wavefs’s design, but confirm the higher-level caller gracefully handles this limitation.


569-577: getPathPrefix comprehension.

getPathPrefix attempts to preserve everything up to the last slash. Verify that this logic aligns with how wavefs is intended to locate files, especially for edge cases where the path may be just '/' or contain trailing slashes.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (14)
pkg/util/fileutil/fileutil.go (2)

168-198: Add documentation for the FsFileInfo type and its methods.

The implementation correctly satisfies the fs.FileInfo interface, but would benefit from documentation explaining its purpose and usage.

Add documentation comments:

+// FsFileInfo implements fs.FileInfo interface for remote file information
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }

+// Name returns the base name of the file
 func (f FsFileInfo) Name() string {
     return f.NameInternal
 }

202-210: Add nil check in ToFsFileInfo.

The conversion logic is correct, but should handle nil input gracefully.

Add nil check:

 func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
+    if fi == nil {
+        return FsFileInfo{}
+    }
     return FsFileInfo{
         NameInternal:    fi.Name,
         ModeInternal:    fi.Mode,
         SizeInternal:    fi.Size,
         ModTimeInternal: fi.ModTime,
         IsDirInternal:   fi.IsDir,
     }
 }
pkg/util/tarcopy/tarcopy.go (3)

52-55: Enhance path validation for better security.

While the current implementation checks for directory traversal after cleaning the path, consider validating the path before cleaning to prevent any potential normalization bypasses.

-header.Name = filepath.Clean(strings.TrimPrefix(file, pathPrefix))
-if err := validatePath(header.Name); err != nil {
+trimmedPath := strings.TrimPrefix(file, pathPrefix)
+if err := validatePath(trimmedPath); err != nil {
+  return err
+}
+header.Name = filepath.Clean(trimmedPath)
+if err := validatePath(header.Name); err != nil {
   return err
 }

68-76: Strengthen path validation for cross-platform security.

Consider adding checks for:

  1. Windows-style paths (e.g., C:\)
  2. Special characters that might be interpreted differently across platforms
  3. Null bytes that could lead to string truncation
 func validatePath(path string) error {
+    if strings.ContainsRune(path, '\x00') {
+        return fmt.Errorf("invalid path containing null byte: %s", path)
+    }
+    if strings.ContainsAny(path, "\\:*?\"<>|") {
+        return fmt.Errorf("invalid path containing special characters: %s", path)
+    }
     if strings.Contains(path, "..") {
         return fmt.Errorf("invalid path containing directory traversal: %s", path)
     }
     if strings.HasPrefix(path, "/") {
         return fmt.Errorf("invalid path starting with /: %s", path)
     }
+    if len(path) >= 3 && path[1] == ':' && path[2] == '\\' {
+        return fmt.Errorf("invalid Windows-style absolute path: %s", path)
+    }
     return nil
 }

94-116: Optimize the loop structure for better readability.

The current loop structure with a select statement can be simplified since the context check is only needed before reading the next tar entry.

-    for {
-        select {
-        case <-ctx.Done():
-            return nil
-        default:
-            next, err := tarReader.Next()
-            if err != nil {
-                // Do one more check for context error before returning
-                if ctx.Err() != nil {
-                    return nil
-                }
-                if errors.Is(err, io.EOF) {
-                    return nil
-                } else {
-                    return fmt.Errorf("cannot read tar stream: %w", err)
-                }
-            }
-            err = readNext(next, tarReader)
-            if err != nil {
-                return err
-            }
-        }
-    }
+    for ctx.Err() == nil {
+        next, err := tarReader.Next()
+        if err != nil {
+            if errors.Is(err, io.EOF) {
+                return nil
+            }
+            return fmt.Errorf("cannot read tar stream: %w", err)
+        }
+        if err := readNext(next, tarReader); err != nil {
+            return err
+        }
+    }
+    return nil
pkg/remote/fileshare/wavefs/wavefs.go (5)

105-158: Consider increasing the default timeout value.

The default timeout of 100ms might be too short for large files or slow network conditions. Consider increasing it to a more reasonable default, like 5 seconds (5000ms).

-	timeout := time.Millisecond * 100
+	timeout := time.Second * 5

388-399: Consider implementing atomic move operation.

The current implementation could leave the system in an inconsistent state if the delete operation fails after a successful copy. Consider implementing a transaction-like approach or adding cleanup on failure.


403-439: Consider using buffered reads for large files.

For large files, consider using a buffer pool or streaming approach to reduce memory usage during the copy operation.

-		_, dataBuf, err := filestore.WFS.ReadFile(ctx, host, srcFileName)
+		const bufferSize = 32 * 1024 // 32KB buffer
+		buffer := make([]byte, bufferSize)
+		reader, _, err := filestore.WFS.ReadStream(ctx, host, srcFileName)

498-533: Improve error aggregation in recursive deletion.

Consider using errors.Join (Go 1.20+) for better error aggregation and formatting:

-			return fmt.Errorf("error deleting blockfiles: %v", errs)
+			return fmt.Errorf("error deleting blockfiles: %w", errors.Join(errs...))

574-582: Replace magic number with named constant.

The magic number 10 in the path length check should be replaced with a named constant for better maintainability and clarity.

+const minPathLength = 10 // minimum length for scheme://host/
 func getPathPrefix(conn *connparse.Connection) string {
 	fullUri := conn.GetFullURI()
 	pathPrefix := fullUri
 	lastSlash := strings.LastIndex(fullUri, "/")
-	if lastSlash > 10 && lastSlash < len(fullUri)-1 {
+	if lastSlash > minPathLength && lastSlash < len(fullUri)-1 {
 		pathPrefix = fullUri[:lastSlash+1]
 	}
 	return pathPrefix
 }
pkg/wshrpc/wshremote/wshremote.go (4)

262-265: Consider extracting the default timeout value into a constant.

The default timeout of 100ms is hardcoded. Consider extracting this into a named constant for better maintainability.

+const defaultTarStreamTimeout = 100 * time.Millisecond

 func (impl *ServerImpl) RemoteTarStreamCommand(ctx context.Context, data wshrpc.CommandRemoteStreamTarData) <-chan wshrpc.RespOrErrorUnion[iochantypes.Packet] {
-    timeout := time.Millisecond * 100
+    timeout := defaultTarStreamTimeout
     if opts.Timeout > 0 {
         timeout = time.Duration(opts.Timeout) * time.Millisecond
     }

290-293: Consider using io.CopyBuffer for potential performance improvement.

When copying file contents, using io.CopyBuffer with a pre-allocated buffer might improve performance for large files.

+    buffer := make([]byte, 32*1024)
-    if _, err := io.Copy(fileWriter, data); err != nil {
+    if _, err := io.CopyBuffer(fileWriter, data, buffer); err != nil {
         log.Printf("error copying file %q: %v\n", file, err)
         return err
     }

Line range hint 356-438: Consider breaking down the large copy operation into smaller functions.

The file copy implementation is quite long and handles multiple concerns. Consider extracting the tar copy logic into separate functions for better maintainability.

+func (impl *ServerImpl) handleTarCopy(ctx context.Context, destPathCleaned string, srcUri string, destUri string, opts *wshrpc.FileCopyOpts) error {
+    timeout := time.Millisecond * 100
+    if opts.Timeout > 0 {
+        timeout = time.Duration(opts.Timeout) * time.Millisecond
+    }
+    readCtx, cancel := context.WithCancelCause(ctx)
+    readCtx, timeoutCancel := context.WithTimeoutCause(readCtx, timeout, fmt.Errorf("timeout copying file %q to %q", srcUri, destUri))
+    defer timeoutCancel()
+    
+    copyStart := time.Now()
+    ioch := wshclient.FileStreamTarCommand(wshfs.RpcClient, wshrpc.CommandRemoteStreamTarData{Path: srcUri, Opts: opts}, &wshrpc.RpcOpts{Timeout: opts.Timeout})
+    
+    stats := &copyStats{}
+    err := tarcopy.TarCopyDest(readCtx, cancel, ioch, impl.processTarEntry(destPathCleaned, stats, opts))
+    if err != nil {
+        return fmt.Errorf("cannot copy %q to %q: %w", srcUri, destUri, err)
+    }
+    
+    log.Printf("RemoteFileCopyCommand: done; %d files copied in %dms, total of %d bytes, %d files skipped\n",
+        stats.numFiles, time.Since(copyStart).Milliseconds(), stats.totalBytes, stats.numSkipped)
+    return nil
+}

755-775: Consider improving error handling for better clarity.

The error handling could be more structured and provide clearer messages.

 func (*ServerImpl) RemoteFileDeleteCommand(ctx context.Context, data wshrpc.CommandDeleteFileData) error {
     expandedPath, err := wavebase.ExpandHomeDir(data.Path)
     if err != nil {
-        return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
+        return fmt.Errorf("cannot expand path %q: %w", data.Path, err)
     }
     cleanedPath := filepath.Clean(expandedPath)
 
-    err = os.Remove(cleanedPath)
+    finfo, err := os.Stat(cleanedPath)
     if err != nil {
-        finfo, _ := os.Stat(cleanedPath)
-        if finfo != nil && finfo.IsDir() {
-            if !data.Recursive {
-                return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
-            }
-            err = os.RemoveAll(cleanedPath)
-            if err != nil {
-                return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
-            }
-        } else {
-            return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
+        if os.IsNotExist(err) {
+            return fmt.Errorf("path %q does not exist", data.Path)
         }
+        return fmt.Errorf("cannot access path %q: %w", data.Path, err)
+    }
+
+    if finfo.IsDir() {
+        if !data.Recursive {
+            return fmt.Errorf("cannot delete directory %q: recursive option not specified", data.Path)
+        }
+        if err := os.RemoveAll(cleanedPath); err != nil {
+            return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
+        }
+        return nil
+    }
+
+    if err := os.Remove(cleanedPath); err != nil {
+        return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
     }
     return nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc8ebf and 7121868.

📒 Files selected for processing (4)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (8 hunks)
🧰 Additional context used
📓 Learnings (1)
pkg/remote/fileshare/wavefs/wavefs.go (1)
Learnt from: esimkowitz
PR: wavetermdev/waveterm#1725
File: pkg/remote/fileshare/wavefs/wavefs.go:441-494
Timestamp: 2025-01-29T04:21:11.649Z
Learning: The `CopyRemote` function in WaveTerm's file operations has proper error handling that prevents partial writes by failing fast and using context cancellation. Each step (path cleaning, file operations, tar reading/writing) is guarded by error checks that prevent proceeding with writes on error.
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (12)
pkg/util/fileutil/fileutil.go (3)

15-15: LGTM! Required imports added for new functionality.

The new imports are necessary for timestamp handling and RPC type conversion.

Also applies to: 18-18


22-22: LGTM! Improved error handling in FixPath.

The error variable is now properly scoped, and the path normalization logic remains correct.

Also applies to: 26-26


200-200: LGTM! Good practice using type assertion.

The type assertion ensures the struct properly implements the fs.FileInfo interface at compile time.

pkg/util/tarcopy/tarcopy.go (2)

1-32: LGTM! Well-structured package setup.

The package declaration, imports, and constants are well-organized. The retry parameters and debug names are clearly defined and follow Go conventions.


119-134: LGTM! Well-implemented resource cleanup.

The gracefulClose function provides robust error handling with:

  • Proper retry logic with configurable parameters
  • Informative logging
  • Clear success/failure indication
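A rough approximation of that retry-on-close pattern; the constant names and values here are assumptions, not the ones defined in `tarcopy`:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"time"
)

const (
	maxCloseRetries = 3                     // assumed retry limit
	closeRetryDelay = 10 * time.Millisecond // assumed backoff between attempts
)

// gracefulClose retries Close a bounded number of times, logging each
// failure, and reports whether the close eventually succeeded.
func gracefulClose(c io.Closer, debugName string) bool {
	for i := 0; i < maxCloseRetries; i++ {
		if err := c.Close(); err != nil {
			log.Printf("%s: close attempt %d failed: %v", debugName, i+1, err)
			time.Sleep(closeRetryDelay)
			continue
		}
		return true
	}
	return false
}

// flakyCloser fails a fixed number of times before succeeding.
type flakyCloser struct{ failures int }

func (f *flakyCloser) Close() error {
	if f.failures > 0 {
		f.failures--
		return fmt.Errorf("still busy")
	}
	return nil
}

func main() {
	fmt.Println(gracefulClose(&flakyCloser{failures: 2}, "demo"))
}
```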
pkg/remote/fileshare/wavefs/wavefs.go (4)

7-24: LGTM! Well-organized imports.

The new imports are properly organized and necessary for implementing the tar streaming and file operations functionality.


178-185: LGTM! Improved logging and nil safety.

The added logging and nil check for opts enhance debugging capabilities and prevent potential nil pointer dereferences.


385-385: LGTM! Better error handling.

Returning errors.ErrUnsupported is more explicit and helps clients handle the unsupported operation correctly.


441-494: LGTM! Robust implementation with proper error handling.

The implementation includes proper error handling, context cancellation, and prevents partial writes by failing fast. The tar streaming approach is well-suited for remote copy operations.

pkg/wshrpc/wshremote/wshremote.go (3)

21-28: LGTM! Import changes align with new functionality.

The new imports support the tar streaming functionality and enhanced file operations.


670-670: LGTM! Clear error message for cross-host move operations.

The error message clearly indicates that moving files between different hosts is not supported.


274-277: 🛠️ Refactor suggestion

Improve error handling in filepath.Walk.

The error handling for the Walk function could be more robust by checking the error before proceeding.

 err := filepath.Walk(path, func(file string, fi os.FileInfo, err error) error {
     if readerCtx.Err() != nil {
         return readerCtx.Err()
     }
+    if err != nil {
+        return fmt.Errorf("error accessing %q: %w", file, err)
+    }
-    if err != nil {
-        return err
-    }

Likely invalid or redundant comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
pkg/wshrpc/wshremote/wshremote.go (3)

297-304: Improve error handling in the goroutine.

The error handling could be improved by checking for context cancellation before sending the error.

Apply this diff to enhance error handling:

-       err = nil
-       if finfo.IsDir() {
-           err = filepath.Walk(path, walkFunc)
-       } else {
-           err = walkFunc(path, finfo, nil)
-       }
-       if err != nil {
-           rtn <- wshutil.RespErr[iochantypes.Packet](err)
-       }
+       err = nil
+       if finfo.IsDir() {
+           err = filepath.Walk(path, walkFunc)
+       } else {
+           err = walkFunc(path, finfo, nil)
+       }
+       if err != nil && readerCtx.Err() == nil {
+           rtn <- wshutil.RespErr[iochantypes.Packet](err)
+       }

428-436: Add deferred close for file operations.

File operations should use deferred close to ensure proper cleanup.

Apply this diff:

     file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
     if err != nil {
         return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
     }
+    defer file.Close()
     _, err = io.Copy(file, reader)
     if err != nil {
         return fmt.Errorf("cannot write file %q: %w", nextPath, err)
     }
-    file.Close()

777-779: Enhance error message for recursive deletion.

The error message could be more descriptive to guide users on how to enable recursive deletion.

Apply this diff:

-               return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
+               return fmt.Errorf("cannot delete directory %q: directory not empty, use -r flag for recursive deletion", data.Path)
pkg/util/tarcopy/tarcopy.go (5)

4-5: Enhance package documentation.

Consider expanding the package documentation to include:

  • Security considerations when handling tar streams
  • Example usage patterns
  • Relationship with other packages like iochan and wshrpc

38-38: Consider adding chunkSize parameter for flexibility.

The function uses a hardcoded wshrpc.FileChunkSize but could benefit from a configurable chunk size parameter for different use cases.


52-55: Strengthen path validation.

While validatePath checks for basic path traversal, consider additional checks:

  • Normalize path separators
  • Check for null bytes
  • Validate against a maximum path length
-			header.Name = filepath.Clean(strings.TrimPrefix(file, pathPrefix))
-			if err := validatePath(header.Name); err != nil {
+			normalizedPath := filepath.ToSlash(filepath.Clean(strings.TrimPrefix(file, pathPrefix)))
+			if err := validatePath(normalizedPath); err != nil {
+				return err
+			}
+			if len(normalizedPath) > 256 {
+				return fmt.Errorf("path too long: %s", normalizedPath)
+			}
+			if strings.Contains(normalizedPath, "\x00") {
+				return fmt.Errorf("path contains null byte: %s", normalizedPath)
+			}
+			header.Name = normalizedPath

68-76: Enhance path validation with additional checks.

Consider adding checks for:

  • Empty paths
  • Windows-style paths (e.g., C:\)
  • Unicode normalization
  • Maximum path segments
 func validatePath(path string) error {
+	if path == "" {
+		return fmt.Errorf("empty path")
+	}
 	if strings.Contains(path, "..") {
 		return fmt.Errorf("invalid tar path containing directory traversal: %s", path)
 	}
 	if strings.HasPrefix(path, "/") {
 		return fmt.Errorf("invalid tar path starting with /: %s", path)
 	}
+	if strings.Contains(path, ":") {
+		return fmt.Errorf("invalid tar path containing drive letter: %s", path)
+	}
+	if len(strings.Split(path, "/")) > 100 {
+		return fmt.Errorf("too many path segments: %s", path)
+	}
 	return nil
 }

112-114: Improve error message for tar stream reading failures.

Add more context to the error return.

-					return err
+					return fmt.Errorf("failed to read tar stream: %w", err)
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 7121868 and eadc3dd.

📒 Files selected for processing (3)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (7 hunks)
  • pkg/wshrpc/wshrpctypes.go (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/wshrpc/wshrpctypes.go
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (5)
pkg/wshrpc/wshremote/wshremote.go (3)

21-28: LGTM! Import changes are appropriate.

The new imports support the tar streaming functionality and enhanced file operations.


262-267: LGTM! Timeout handling is well-implemented.

The timeout configuration with fallback to default value and proper context management is correct.


274-282: ⚠️ Potential issue

Fix error variable shadowing in walkFunc.

The error variable is shadowed in the walkFunc, which could mask errors from filepath.Walk.

Apply this diff to fix the error handling:

-           if err = writeHeader(info, path); err != nil {
+           if writeErr := writeHeader(info, path); writeErr != nil {
-               return err
+               return writeErr
            }

Likely invalid or redundant comment.

pkg/util/tarcopy/tarcopy.go (2)

24-32: LGTM! Well-organized constants.

The constants are well-named and their values are appropriate for their purposes.


124-139: LGTM! Robust resource cleanup implementation.

The gracefulClose function has:

  • Well-implemented retry logic
  • Good error logging
  • Appropriate use of constants

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (11)
pkg/remote/connparse/connparse_test.go (2)

193-215: Add t.Parallel() for consistent test parallelization.

The test follows the established pattern and correctly validates Unix-style relative paths. However, it's missing the t.Parallel() call that's present in other test functions.

 func TestParseURI_WSHCurrentPath(t *testing.T) {
+    t.Parallel()
     cstr := "./Documents/path/to/file"

217-239: Add t.Parallel() for consistent test parallelization.

The test correctly handles Windows-style path separators which is crucial for cross-platform compatibility. However, it's missing the t.Parallel() call that's present in other test functions.

 func TestParseURI_WSHCurrentPathWindows(t *testing.T) {
+    t.Parallel()
     cstr := ".\\Documents\\path\\to\\file"
pkg/remote/fileshare/wavefs/wavefs.go (3)

109-162: LGTM! Well-implemented tar streaming with proper error handling.

The implementation includes:

  • Proper context management with timeouts
  • Detailed logging
  • Error handling at each step

Consider making file mode configurable.

The file mode is currently hardcoded to 0644 (line 136). Consider making this configurable through the options parameter.

-			file.Mode = 0644
+			if opts.FileMode != 0 {
+				file.Mode = opts.FileMode
+			} else {
+				file.Mode = 0644
+			}

502-537: Document partial failure behavior in recursive deletion.

The implementation collects errors during recursive deletion but continues with remaining files. Consider documenting this behavior in the function comment to make it clear that:

  1. The operation is not atomic
  2. Some files may be deleted even if the operation returns an error
+// Delete removes a file or directory. When recursive is true, it attempts to delete
+// all entries under the given path. The operation is not atomic - if an error occurs
+// during recursive deletion, some files may have been deleted while others remain.
 func (c WaveClient) Delete(ctx context.Context, conn *connparse.Connection, recursive bool) error {

578-586: Add validation in getPathPrefix.

Consider adding validation for the connection parameter and documenting the function's behavior.

+// getPathPrefix returns the directory prefix for the given connection.
+// For a file path, it returns the directory containing the file.
+// For a directory path, it returns the directory path with a trailing slash.
 func getPathPrefix(conn *connparse.Connection) string {
+	if conn == nil {
+		return ""
+	}
 	fullUri := conn.GetFullURI()
 	pathPrefix := fullUri
 	lastSlash := strings.LastIndex(fullUri, "/")
pkg/util/tarcopy/tarcopy.go (3)

38-38: Consider adding error return for potential pipe creation failures.

The function signature could be improved by returning an error to handle potential failures in pipe creation or tar writer initialization.

-func TarCopySrc(ctx context.Context, pathPrefix string) (outputChan chan wshrpc.RespOrErrorUnion[iochantypes.Packet], writeHeader func(fi fs.FileInfo, file string) error, writer io.Writer, close func()) {
+func TarCopySrc(ctx context.Context, pathPrefix string) (outputChan chan wshrpc.RespOrErrorUnion[iochantypes.Packet], writeHeader func(fi fs.FileInfo, file string) error, writer io.Writer, close func(), err error) {

68-76: Consider additional path validations.

While the current validation is good, consider adding checks for:

  1. Empty paths
  2. Windows-style absolute paths (e.g., C:\)
 func validatePath(path string) error {
+    if path == "" {
+        return fmt.Errorf("empty tar path")
+    }
     if strings.Contains(path, "..") {
         return fmt.Errorf("invalid tar path containing directory traversal: %s", path)
     }
     if strings.HasPrefix(path, "/") {
         return fmt.Errorf("invalid tar path starting with /: %s", path)
     }
+    if filepath.VolumeName(path) != "" {
+        return fmt.Errorf("invalid tar path with volume name: %s", path)
+    }
     return nil
 }

103-113: Improve error specificity in tar reading.

The error handling could be more specific about the type of error encountered during tar reading.

             if err != nil {
                 // Do one more check for context error before returning
                 if ctx.Err() != nil {
                     return context.Cause(ctx)
                 }
                 if errors.Is(err, io.EOF) {
                     return nil
                 } else {
-                    return err
+                    return fmt.Errorf("failed to read tar entry: %w", err)
                 }
             }
pkg/wshrpc/wshremote/wshremote.go (1)

777-791: Optimize error handling in delete operation.

The current implementation makes an unnecessary os.Stat call after os.Remove fails. We can improve this by checking if the error indicates that the path is a directory.

 err = os.Remove(cleanedPath)
 if err != nil {
-  finfo, _ := os.Stat(cleanedPath)
-  if finfo != nil && finfo.IsDir() {
+  if errors.Is(err, syscall.EISDIR) {
     if !data.Recursive {
       return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
     }
     err = os.RemoveAll(cleanedPath)
     if err != nil {
       return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
     }
   } else {
     return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
   }

Note: Add "syscall" to the imports.

pkg/util/fileutil/fileutil.go (2)

Line range hint 23-33: Consider using debug level for path-related logs.

The logging statements are helpful for debugging but might be too verbose for production. Consider using a debug log level or making the logging configurable.

-		log.Printf("FixPath: path is not absolute: %s", path)
+		log.Debug().Msgf("FixPath: path is not absolute: %s", path)
-		log.Printf("FixPath: fixed path: %s", path)
+		log.Debug().Msgf("FixPath: fixed path: %s", path)

171-201: Add documentation for the FsFileInfo type and its methods.

While the implementation is correct, adding documentation would improve maintainability. Consider documenting:

  1. The purpose of the type
  2. The meaning of the "Internal" suffix in field names
  3. Why Sys() returns nil
+// FsFileInfo implements fs.FileInfo interface to provide file information
+// for remote file operations. Field names are suffixed with "Internal" to
+// avoid confusion with method names.
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }

+// Name returns the base name of the file
 func (f FsFileInfo) Name() string {
     return f.NameInternal
 }

+// Size returns the length in bytes for regular files
 func (f FsFileInfo) Size() int64 {
     return f.SizeInternal
 }

+// Mode returns the file mode bits
 func (f FsFileInfo) Mode() os.FileMode {
     return f.ModeInternal
 }

+// ModTime returns the modification time
 func (f FsFileInfo) ModTime() time.Time {
     return time.Unix(0, f.ModTimeInternal)
 }

+// IsDir returns whether the file is a directory
 func (f FsFileInfo) IsDir() bool {
     return f.IsDirInternal
 }

+// Sys returns nil as underlying data source is not available
 func (f FsFileInfo) Sys() interface{} {
     return nil
 }
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between eadc3dd and 3b28f01.

📒 Files selected for processing (6)
  • pkg/remote/connparse/connparse.go (2 hunks)
  • pkg/remote/connparse/connparse_test.go (1 hunks)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/remote/connparse/connparse.go
🧰 Additional context used
📓 Learnings (1)
pkg/remote/fileshare/wavefs/wavefs.go (1)
Learnt from: esimkowitz
PR: wavetermdev/waveterm#1725
File: pkg/remote/fileshare/wavefs/wavefs.go:441-494
Timestamp: 2025-01-29T04:21:11.649Z
Learning: The `CopyRemote` function in WaveTerm's file operations has proper error handling that prevents partial writes by failing fast and using context cancellation. Each step (path cleaning, file operations, tar reading/writing) is guarded by error checks that prevent proceeding with writes on error.
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (13)
pkg/remote/connparse/connparse_test.go (1)

193-239: LGTM! Well-structured tests for cross-platform path handling.

The new tests comprehensively validate URI parsing for both Unix and Windows relative paths, which is essential for the PR's objective of enabling cross-remote copy/move operations. The tests maintain consistent behavior with the existing test suite while properly handling platform-specific path formats.

pkg/remote/fileshare/wavefs/wavefs.go (5)

7-34: LGTM! Well-structured imports and timeout constant.

The new imports support the tar streaming functionality, and the DefaultTimeout constant provides a good default value for operations.


182-189: LGTM! Improved robustness with nil checks.

Good addition of logging and nil checks for the opts parameter.


389-389: LGTM! Clear indication of unsupported operation.

Good change to return errors.ErrUnsupported instead of nil.


392-443: LGTM! Well-structured internal move and copy operations.

Good separation of concerns between internal and remote operations. Proper error handling and host validation.


445-498: LGTM! Robust implementation of remote copy with proper error handling.

The implementation includes:

  • Context cancellation for proper cleanup
  • Detailed logging
  • Proper error handling at each step
  • Event publishing for file system changes
pkg/util/tarcopy/tarcopy.go (4)

1-32: Well-structured package with clear organization!

The package follows Go best practices with proper license headers, organized imports, and well-named constants.


52-55: Robust path validation implementation!

The combination of filepath.Clean, strings.TrimPrefix, and validatePath provides strong protection against path traversal attacks.
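A standalone approximation of that sanitization sequence, using a hypothetical `sanitizeTarPath` helper rather than the actual `tarcopy` code:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitizeTarPath mirrors the Clean + TrimPrefix + validate sequence:
// strip the source prefix, normalize the path, then reject traversal
// segments and absolute paths before the name goes into a tar header.
func sanitizeTarPath(file, pathPrefix string) (string, error) {
	name := filepath.ToSlash(filepath.Clean(strings.TrimPrefix(file, pathPrefix)))
	if strings.Contains(name, "..") {
		return "", fmt.Errorf("invalid tar path containing directory traversal: %s", name)
	}
	if strings.HasPrefix(name, "/") {
		return "", fmt.Errorf("invalid tar path starting with /: %s", name)
	}
	return name, nil
}

func main() {
	ok, _ := sanitizeTarPath("/srv/data/sub/file.txt", "/srv/data/")
	fmt.Println(ok)
	_, err := sanitizeTarPath("/srv/data/../../etc/passwd", "/srv/data/")
	fmt.Println(err)
}
```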


97-99: Remove duplicate context error check.

The context error check is duplicated. The second check at lines 105-107 is redundant.

Also applies to: 105-107


122-137: Well-implemented resource cleanup with retry logic!

The gracefulClose function provides robust resource cleanup with:

  • Clear retry logic with configurable limits
  • Informative debug logging
  • Proper status reporting
pkg/wshrpc/wshremote/wshremote.go (2)

686-686: LGTM!

The error message for cross-host move operations is clear and appropriate.


278-299: ⚠️ Potential issue

Fix error variable shadowing and improve error handling in walkFunc.

The error variable is shadowed in the walkFunc, which could mask errors from filepath.Walk. Also, the error handling can be improved.

-walkFunc := func(path string, info fs.FileInfo, err error) error {
-  if readerCtx.Err() != nil {
-    return readerCtx.Err()
-  }
-  if err != nil {
-    return err
-  }
-  if err = writeHeader(info, path); err != nil {
+walkFunc := func(path string, info fs.FileInfo, walkErr error) error {
+  if readerCtx.Err() != nil {
+    return readerCtx.Err()
+  }
+  if walkErr != nil {
+    return fmt.Errorf("walk error at %q: %w", path, walkErr)
+  }
+  if err := writeHeader(info, path); err != nil {

Likely invalid or redundant comment.

pkg/util/fileutil/fileutil.go (1)

16-16: LGTM!

The new imports are properly organized and necessary for the added functionality.

Also applies to: 19-19

Comment on lines +432 to 440
file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
if err != nil {
return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
}
_, err = io.Copy(file, tarReader)
_, err = io.Copy(file, reader)
if err != nil {
return fmt.Errorf("cannot write file %q: %w", nextPath, err)
}
file.Chmod(finfo.Mode())
file.Close()

🛠️ Refactor suggestion

Improve file handling with deferred close.

The file.Close() call should be deferred immediately after the file is opened to prevent resource leaks in case of errors.

 file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
 if err != nil {
   return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
 }
+defer file.Close()
 _, err = io.Copy(file, reader)
 if err != nil {
   return fmt.Errorf("cannot write file %q: %w", nextPath, err)
 }
-file.Close()

Comment on lines +270 to +271
readerCtx, cancel := context.WithTimeout(context.Background(), timeout)
rtn, writeHeader, fileWriter, tarClose := tarcopy.TarCopySrc(readerCtx, pathPrefix)

🛠️ Refactor suggestion

Fix context chain for proper cancellation.

The timeout context should be created from the parent context to ensure proper cancellation chain.

-readerCtx, cancel := context.WithTimeout(context.Background(), timeout)
+readerCtx, cancel := context.WithTimeout(ctx, timeout)

Comment on lines +205 to +213
func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
return FsFileInfo{
NameInternal: fi.Name,
ModeInternal: fi.Mode,
SizeInternal: fi.Size,
ModTimeInternal: fi.ModTime,
IsDirInternal: fi.IsDir,
}
}

🛠️ Refactor suggestion

Add nil check and documentation for ToFsFileInfo.

The conversion function should validate its input and be documented.

+// ToFsFileInfo converts wshrpc.FileInfo to FsFileInfo.
+// It panics if fi is nil.
 func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
+    if fi == nil {
+        panic("ToFsFileInfo: nil FileInfo")
+    }
     return FsFileInfo{
         NameInternal:    fi.Name,
         ModeInternal:    fi.Mode,
         SizeInternal:    fi.Size,
         ModTimeInternal: fi.ModTime,
         IsDirInternal:   fi.IsDir,
     }
 }

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/util/fileutil/fileutil.go (1)

202-210: 🛠️ Refactor suggestion

Add nil check and documentation for ToFsFileInfo.

The conversion function should validate its input and be documented.

+// ToFsFileInfo converts wshrpc.FileInfo to FsFileInfo.
+// It panics if fi is nil.
 func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
+    if fi == nil {
+        panic("ToFsFileInfo: nil FileInfo")
+    }
     return FsFileInfo{
         NameInternal:    fi.Name,
         ModeInternal:    fi.Mode,
         SizeInternal:    fi.Size,
         ModTimeInternal: fi.ModTime,
         IsDirInternal:   fi.IsDir,
     }
 }
🧹 Nitpick comments (1)
pkg/util/fileutil/fileutil.go (1)

168-174: Add documentation for FsFileInfo type.

Please add documentation explaining the purpose of this type and its relationship to fs.FileInfo interface.

+// FsFileInfo implements fs.FileInfo interface and provides a way to represent file information
+// in a format that can be easily serialized and transmitted between different parts of the system.
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b28f01 and 317dae4.

📒 Files selected for processing (1)
  • pkg/util/fileutil/fileutil.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
pkg/util/fileutil/fileutil.go (4)

15-15: LGTM! Required imports added.

The new imports support the added functionality - time for ModTime operations and wshrpc for FileInfo type conversion.

Also applies to: 18-18


22-22: LGTM! Improved error handling.

Good changes to prevent variable shadowing and properly handle filepath.Abs errors.

Also applies to: 26-29


200-200: LGTM! Good compile-time interface check.

The type assertion ensures FsFileInfo properly implements fs.FileInfo interface.


188-190: Add validation for ModTime conversion.

The ModTime conversion from nanoseconds should handle potential overflow cases.

 func (f FsFileInfo) ModTime() time.Time {
+    if f.ModTimeInternal < 0 {
+        return time.Time{}
+    }
     return time.Unix(0, f.ModTimeInternal)
 }

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/wshutil/wshutil.go (1)

487-488: LGTM! Consider documenting the cleanup sequence.

The addition of channel closing in the cleanup phase is correct and follows Go's best practices for resource management. This ensures proper cleanup of communication channels when the connection is terminated.

Consider adding a comment to document the cleanup sequence:

 defer func() {
     conn.Close()
+    // Close communication channels to prevent resource leaks
     close(proxy.FromRemoteCh)
     close(proxy.ToRemoteCh)
pkg/wshutil/wshrpcio.go (1)

25-42: Consider documenting the cleanup pattern for reuse.

This cleanup pattern is valuable for streaming operations. Consider:

  1. Adding comments explaining the pattern for future maintainers
  2. Creating a reusable helper if this pattern is needed elsewhere in the codebase
  3. Documenting this as a best practice for handling streaming errors
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 317dae4 and 9265cd0.

📒 Files selected for processing (2)
  • pkg/wshutil/wshrpcio.go (1 hunks)
  • pkg/wshutil/wshutil.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
pkg/wshutil/wshrpcio.go (2)

36-37: LGTM! Comprehensive error handling with proper cleanup.

The error handling is well-implemented:

  • Sets cleanup flag before returning
  • Wraps errors with descriptive context
  • Consistently handles both data and newline write errors

Also applies to: 41-42


25-33: LGTM! Clean error handling with proper resource cleanup.

The deferred cleanup mechanism is well-implemented:

  • Prevents goroutine leaks by only starting cleanup goroutine when needed
  • Ensures channel is drained on errors, preventing blocked goroutines
  • No race conditions as the drain flag is properly scoped

Let's verify the cleanup behavior:

✅ Verification successful

Cleanup pattern is consistently implemented across the codebase

The implementation follows the established pattern in the codebase:

  • Similar cleanup mechanism already exists in pkg/web/web.go
  • Both use flag-controlled deferred cleanup
  • Both prevent goroutine leaks in streaming operations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that this cleanup pattern is consistently used across similar streaming operations

# Look for other streaming operations that might benefit from this pattern
rg -l "chan \[\]byte.*io\.Writer" --type go

# Check for potential goroutine leaks in error handling
ast-grep --pattern 'go func() {
  for range $_ {
  }
}()'

Length of output: 412


Script:

#!/bin/bash
# Get context around the cleanup pattern in web.go
rg "for range rtnCh" -B 10 -A 5 pkg/web/web.go

# Check error handling patterns around both cleanup locations
ast-grep --pattern 'if err != nil {
  $$$
  go func() {
    for range $_ {
    }
  }()
  $$$
}'

Length of output: 466
