GoExercise

General Info:

This program performs a distributed grep using the MapReduce paradigm, which is typical of big data problems. The GoGrep program returns the lines of a large input text file that match a given regex (i.e., a regular expression). The input file is chosen by the end user. The program is written in Go and uses RPC for communication between the client, the master, and the worker peers.

Running:

To run the program, it is necessary to first set up the master and the worker peers and initialize the corresponding module. If the file go.mod is not in the GoExercise repository, then:

Initialize the module for the repository:

go mod init GoExercise

Add module requirements and sums:

go mod tidy

Now you can run the code.

Run the master:

Open a new terminal and input the following command:

go run master/master.go

Run the worker:

Open a new terminal and input the following command:

go run worker/worker.go

Once the master and the worker peer are up, the client can also be run. It connects to the master on the random port on which the master is listening for incoming connections.
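As an illustration of this connection step, here is a minimal sketch of a master that listens on a port chosen by the operating system and a client that dials it, using Go's standard net/rpc package. The names Master, GrepArgs, GrepReply, startMaster, and callGrep are hypothetical and are not taken from the repository. How the real client learns the master's port is not described here, so the sketch simply passes the address along within one process, and the master greps locally instead of delegating to workers.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
	"regexp"
)

// GrepArgs and GrepReply are hypothetical message types for this sketch.
type GrepArgs struct {
	Lines []string
	Regex string
}

type GrepReply struct {
	Matches []string
}

// Master is a hypothetical RPC receiver; the real master's API may differ.
type Master struct{}

// Grep returns the lines that match the regex. It runs locally here,
// whereas the real master would distribute the work to worker peers.
func (m *Master) Grep(args GrepArgs, reply *GrepReply) error {
	re, err := regexp.Compile(args.Regex)
	if err != nil {
		return err
	}
	for _, line := range args.Lines {
		if re.MatchString(line) {
			reply.Matches = append(reply.Matches, line)
		}
	}
	return nil
}

// startMaster listens on port 0, which lets the OS pick a free port, and
// serves RPC requests in the background. It returns the bound address.
func startMaster() (string, error) {
	server := rpc.NewServer()
	if err := server.Register(new(Master)); err != nil {
		return "", err
	}
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go server.Accept(lis)
	return lis.Addr().String(), nil
}

// callGrep dials the master at addr and invokes the hypothetical Master.Grep.
func callGrep(addr string, lines []string, regex string) ([]string, error) {
	client, err := rpc.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer client.Close()
	var reply GrepReply
	err = client.Call("Master.Grep", GrepArgs{Lines: lines, Regex: regex}, &reply)
	return reply.Matches, err
}

func main() {
	addr, err := startMaster()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("master listening on", addr)
	matches, err := callGrep(addr, []string{"go is fun", "hello", "gopher"}, "^go")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(matches)
}
```

Listening on port 0 is the usual way to obtain a random free port in Go; the address returned by the listener is what a client would need in order to connect.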

Run the client:

Open a new terminal and input the following command:

go run client/client.go

It's possible to choose a specific file to grep; in that case the file must be present in the directory client/files. The user must then specify the regex. It's also possible to choose multiple regexes to use for the grep on the chosen file. If any of the lines of the text match the specified regex, the master returns the set of those lines.

MapReduce

Once the file is chosen, the master splits its content into multiple chunks. Each chunk is then distributed to a different worker. The master spawns N workers, where N is determined by distributing the total lines of the file evenly, so that each worker processes at most 10 lines. Each worker performs the following operation on the chunk of the file received from the master:
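The sizing rule above can be sketched as follows. This is a minimal illustration, assuming N is the smallest number of workers for which no chunk exceeds 10 lines and that lines are spread as evenly as possible. The function splitChunks and the constant maxLinesPerWorker are hypothetical names, and the repository's actual logic may differ.

```go
package main

import "fmt"

// maxLinesPerWorker mirrors the 10-line limit described in the text.
const maxLinesPerWorker = 10

// splitChunks divides lines into N contiguous chunks, where N is the
// smallest count that keeps every chunk at or below maxPerChunk lines.
// Chunk sizes differ by at most one line.
func splitChunks(lines []string, maxPerChunk int) [][]string {
	if len(lines) == 0 || maxPerChunk <= 0 {
		return nil
	}
	n := (len(lines) + maxPerChunk - 1) / maxPerChunk // ceiling division
	base := len(lines) / n                            // minimum lines per chunk
	extra := len(lines) % n                           // first `extra` chunks get one more
	chunks := make([][]string, 0, n)
	start := 0
	for i := 0; i < n; i++ {
		size := base
		if i < extra {
			size++
		}
		chunks = append(chunks, lines[start:start+size])
		start += size
	}
	return chunks
}

func main() {
	lines := make([]string, 25)
	for i := range lines {
		lines[i] = fmt.Sprintf("line %d", i+1)
	}
	for i, c := range splitChunks(lines, maxLinesPerWorker) {
		fmt.Printf("worker %d gets %d lines\n", i, len(c))
	}
}
```

With 25 input lines, this sketch produces three chunks of 9, 8, and 8 lines, so three workers would be spawned.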

Map
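The map step is not spelled out here, so the following is only a sketch of what a grep-style map could look like: the worker scans its chunk and emits every line matched by each regex. The type KeyValue and the function mapChunk are hypothetical names, and keying the output by regex is an assumption made to support multiple regexes.

```go
package main

import (
	"fmt"
	"log"
	"regexp"
)

// KeyValue is a hypothetical map-output pair for this sketch.
type KeyValue struct {
	Key   string // the regex that matched
	Value string // the matching line
}

// mapChunk emits one pair for every (regex, line) combination that matches.
// It returns an error if any pattern is not a valid regular expression.
func mapChunk(chunk []string, patterns []string) ([]KeyValue, error) {
	var out []KeyValue
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid regex %q: %w", p, err)
		}
		for _, line := range chunk {
			// MatchString reports a match anywhere in the line,
			// unless the pattern is anchored with ^ or $.
			if re.MatchString(line) {
				out = append(out, KeyValue{Key: p, Value: line})
			}
		}
	}
	return out, nil
}

func main() {
	chunk := []string{"foo bar", "baz", "food"}
	pairs, err := mapChunk(chunk, []string{"foo", "^b"})
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range pairs {
		fmt.Printf("%s -> %s\n", kv.Key, kv.Value)
	}
}
```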

Each worker then returns the result of the mapping to the master, which proceeds to perform the following operation:

Shuffle & Sort

The master then spawns one or more workers that perform the following operation:

Reduce

The master receives the results from the workers and merges the results obtained. Finally, it returns the subset of lines to the client.
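The shuffle, sort, and reduce phases can be sketched as follows. These steps are named but not detailed here, so the sketch assumes that each map result records the regex, the line number, and the line; that shuffling groups results by regex; that sorting restores file order within each group; and that reducing collapses a group into its list of lines. Match, shuffleAndSort, reduce, and merge are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Match is a hypothetical map-output record: a regex and a line it matched.
type Match struct {
	Regex  string
	LineNo int
	Line   string
}

// shuffleAndSort groups matches by regex and orders each group by line
// number, so results read in file order whichever worker finished first.
func shuffleAndSort(mapped [][]Match) map[string][]Match {
	groups := make(map[string][]Match)
	for _, workerOut := range mapped {
		for _, m := range workerOut {
			groups[m.Regex] = append(groups[m.Regex], m)
		}
	}
	for _, g := range groups {
		sort.Slice(g, func(i, j int) bool { return g[i].LineNo < g[j].LineNo })
	}
	return groups
}

// reduce collapses one group into the list of its matching lines.
func reduce(group []Match) []string {
	lines := make([]string, 0, len(group))
	for _, m := range group {
		lines = append(lines, m.Line)
	}
	return lines
}

// merge applies reduce to every group and collects the result per regex.
func merge(groups map[string][]Match) map[string][]string {
	result := make(map[string][]string, len(groups))
	for regex, g := range groups {
		result[regex] = reduce(g)
	}
	return result
}

func main() {
	// Output of two hypothetical workers, arriving out of file order.
	mapped := [][]Match{
		{{Regex: "go", LineNo: 12, Line: "gopher"}},
		{{Regex: "go", LineNo: 3, Line: "go run"}, {Regex: "er$", LineNo: 5, Line: "worker"}},
	}
	for regex, lines := range merge(shuffleAndSort(mapped)) {
		fmt.Println(regex, lines)
	}
}
```

Carrying the line number through the pipeline is a design choice of this sketch: it lets the master return lines in their original order even when chunks are processed concurrently.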

Diana0422/GoExercise
