TangZhongham/learned-big-data

learned-big-data

A collection of what I've learned in big data.

I am self-taught, and I hope these materials show that I am a strong candidate.

TODO

  • Some Flink jobs have not been added yet
  • Some Python pandas/backend projects have not been added yet

Whole Stack

| Big Data | Project |
| --- | --- |
| Data warehouse (Java) | ✅ Sqoop / SparkSQL / HDFS API / HBase API / JDBC |
| Real-time Streaming | ✅ Flink / Kafka |
| Algorithms | ✅ Java |
| Java | ✅ Multi-thread data generator |
| Python | ✅ Data faker |
| Rust | ✅ Command line stock watcher |
| Linux scripts | ✅ Useful Linux scripts |

Data Warehouse

This is what I've learned about building a data warehouse.

| Data Warehouse Projects | Detail |
| --- | --- |
| Sqoop | Several ETL scripts I wrote from RDBMS to Spark on Hive |
| Elasticsearch | Elasticsearch + esrally benchmark test |
| SparkSQL | Useful SparkSQL queries I wrote |
| Apache Kylin | Use Apache Kylin to accelerate queries |
| HDFS Operations | HDFS operation API in Java |
| HDFS Kerberos Login | HDFS Kerberos multi-thread login method |
| HBase API with Kerberos | HBase API with Kerberos login |

Real-time Streaming

This is some code I wrote for streaming analysis.

| Streaming Projects | Detail |
| --- | --- |
| Flink Filter with Kafka | Flink benchmark test for data filtering with Kafka |
| Flink Window Count with Kafka | Flink benchmark test for window-grouping count with Kafka |
| Flink Tumbling Window with Kafka | Flink benchmark test for tumbling-window with Kafka |
| Kafka Kerberos scripts | Useful Kafka cmd client scripts I wrote |
| Kafka Multi-thread Producer | Kafka Java producer for benchmark testing |
| Kafka Java Consumer with Kerberos | Kafka Java consumer for benchmark testing |
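
For readers new to the tumbling-window benchmark listed above, here is a minimal plain-Java sketch of the idea: each event falls into exactly one fixed-size, non-overlapping window, and events are counted per window. It is not the Flink job from this repository and uses no Flink or Kafka API; the class name `TumblingWindowCountSketch`, the method `countPerWindow`, and the sample timestamps are all hypothetical, chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java, no Flink or Kafka dependency.
final class TumblingWindowCountSketch {

    /**
     * Counts events per fixed, non-overlapping (tumbling) window.
     * Returns a map from window start timestamp (ms) to event count.
     */
    static Map<Long, Integer> countPerWindow(long[] timestampsMs, long windowSizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestampsMs) {
            // Align the timestamp down to the start of its window.
            long windowStart = ts - Math.floorMod(ts, windowSizeMs);
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical event timestamps in milliseconds.
        long[] events = {0L, 400L, 999L, 1000L, 2500L};
        System.out.println(countPerWindow(events, 1000L)); // {0=3, 1000=1, 2000=1}
    }
}
```

A real streaming engine also has to handle late and out-of-order events and decide when a window is complete; this sketch assumes the whole input is available up front.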

Algorithms

Algorithms I have implemented.

Algorithms implemented in Java
Bubble Sort
Insertion Sort
Merge Sort
Quick Sort
Shell Sort
Binary Insertion Sort
Insertion Sort vs. Bubble Sort
Selection Sort
Binary Search
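
As a small illustration of how two entries in this list fit together, the sketch below shows one common way to write binary insertion sort: ordinary insertion sort, except that the insert position in the sorted prefix is found with a binary search. This is not the implementation from this repository; the class name `BinaryInsertionSortSketch` and the sample data are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: not the repository's implementation.
final class BinaryInsertionSortSketch {

    /** Sorts the array in place in ascending order. */
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            // Binary search the sorted prefix a[0..i) for the first position
            // whose element is greater than key; inserting there keeps the sort stable.
            int lo = 0;
            int hi = i;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (a[mid] <= key) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // Shift a[lo..i) one slot to the right, then place key at lo.
            System.arraycopy(a, lo, a, lo + 1, i - lo);
            a[lo] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 5, 5, 6, 9]
    }
}
```

The binary search reduces the number of comparisons to O(n log n), but the element shifts still make the overall running time O(n²) in the worst case.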

Java

Other Java Projects
Data Generator
Yet Another Data Generator

Python

Python Projects
Python data faker

Rust

Rust Projects
Cmd Stock watcher

Linux scripts

Linux Script Projects
Useful Linux scripts
