A collection of what I have learned about big data.
I am self-taught, and I hope these materials show that I am a strong candidate.
TODO
- Some Flink jobs have not been added yet
- Some Python pandas/backend projects...

Big Data | Project |
---|---|
Data warehouse (Java) | ✅ Sqoop / SparkSQL / HDFS API / HBase API / JDBC |
Real-time Streaming | ✅ Flink / Kafka |
Algorithms | ✅ Java |
Java | ✅ Multi-thread data generator |
Python | ✅ Data faker |
Rust | ✅ Command line stock watcher |
Linux scripts | ✅ Useful Linux scripts |

These are the tools I learned while building a data warehouse.

Data Warehouse Projects | Detail |
---|---|
Sqoop | Several ETL scripts I wrote to move data from RDBMS to Spark on Hive |
Elasticsearch | Elasticsearch + esrally benchmark test |
SparkSQL | Useful SparkSQL queries I wrote |
Apache Kylin | Using Apache Kylin to accelerate queries |
HDFS Operations | HDFS operations API in Java |
HDFS Kerberos Login | Multi-threaded Kerberos login method for HDFS |
HBase API with Kerberos | HBase API with Kerberos login |

This is some code I wrote for streaming analysis.

Streaming Projects | Detail |
---|---|
Flink Filter with Kafka | Flink benchmark test for data filtering with Kafka |
Flink Window Count with Kafka | Flink benchmark test for window-grouped counts with Kafka |
Flink Tumbling Window with Kafka | Flink benchmark test for tumbling windows with Kafka |
Kafka Kerberos scripts | Useful Kafka command-line client scripts I wrote |
Kafka Multi-thread Producer | Multi-threaded Kafka Java producer for benchmark testing |
Kafka Java Consumer with Kerberos | Kafka Java consumer for benchmark testing |
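
To illustrate what the tumbling-window count benchmarks measure, here is a minimal sketch in plain Java. It does not use the Flink or Kafka APIs and is not the code from the projects above; the class name `TumblingWindowCountSketch`, the method `countPerWindow`, and the sample timestamps are all hypothetical. It only shows the idea: each event falls into exactly one fixed-size, non-overlapping window, and events are counted per window.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of tumbling-window counting (not Flink code).
final class TumblingWindowCountSketch {

    // Counts events per tumbling window.
    // Key: window start in milliseconds; value: number of events in that window.
    static Map<Long, Long> countPerWindow(long[] timestampsMs, long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("windowSizeMs must be positive");
        }
        Map<Long, Long> counts = new TreeMap<>();
        for (long ts : timestampsMs) {
            // Align the timestamp down to the start of its window.
            long windowStart = Math.floorDiv(ts, windowSizeMs) * windowSizeMs;
            counts.merge(windowStart, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] events = {1000L, 1500L, 4999L, 5000L, 12000L}; // made-up sample data
        System.out.println(countPerWindow(events, 5000L));
        // prints {0=3, 5000=1, 10000=1}
    }
}
```

In a real Flink job the framework assigns windows and handles event time, watermarks, and state; this sketch covers only the window-assignment arithmetic.
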

Algorithms I have implemented:

Algorithms implemented in Java |
---|
Bubble Sort |
Insertion Sort |
Merge Sort |
Quick Sort |
Shell Sort |
Binary Insertion Sort |
Insertion Sort vs. Bubble Sort |
Selection Sort |
Binary Search |
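
As an example of how two entries in the list above combine, here is a minimal sketch of binary insertion sort in Java. It is written for illustration and is not necessarily the implementation in this repository; the class name `BinaryInsertionSortSketch` is hypothetical. It uses a binary search to find the insertion point in the already-sorted prefix, then shifts elements to make room.

```java
import java.util.Arrays;

// Hypothetical illustration of binary insertion sort.
final class BinaryInsertionSortSketch {

    // Sorts the array in place in ascending order.
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            // Binary search in the sorted prefix a[0..i) for the first index
            // whose value is greater than key (this keeps the sort stable).
            int lo = 0;
            int hi = i;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (a[mid] <= key) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // Shift a[lo..i) one slot to the right and insert the key.
            System.arraycopy(a, lo, a, lo + 1, i - lo);
            a[lo] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 5, 5, 6, 9]
    }
}
```

The binary search reduces comparisons to O(log n) per element, but shifting is still O(n) per element, so the overall worst case remains O(n^2).
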
Other Java Projects |
---|
Data Generator |
Yet Another Data Generator |

Python Projects |
---|
Python data faker |

Rust Projects |
---|
Command-line stock watcher |