Commit

updated to 0.1.0
sadikovi committed Feb 14, 2016
1 parent 152b662 commit 900b48f
Showing 3 changed files with 12 additions and 7 deletions.
16 changes: 11 additions & 5 deletions README.md
@@ -7,25 +7,28 @@ A library for reading NetFlow files from [Spark SQL](http://spark.apache.org/doc
## Requirements
| Spark version | spark-netflow version |
|---------------|-----------------------|
-| 1.4+ | [0.0.2](http://spark-packages.org/package/sadikovi/spark-netflow) |
+| 1.4+ | [0.1.0](http://spark-packages.org/package/sadikovi/spark-netflow) |

## Linking
The spark-netflow library can be added to Spark by using the `--packages` command line option. For
example, run this to include it when starting the spark shell:
```shell
-$SPARK_HOME/bin/spark-shell --packages sadikovi:spark-netflow:0.0.2-s_2.10
+$SPARK_HOME/bin/spark-shell --packages sadikovi:spark-netflow:0.1.0-s_2.10
```

## Features
- Column pruning
- Predicate pushdown on capture time (`unix_secs` field)
- Fields conversion (IP addresses, protocol, etc.)
- NetFlow version 5 support
- NetFlow version 7 support

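As a rough illustration of the first two features, the sketch below selects a single column and filters on capture time. It assumes a spark-shell session where `sqlContext` is already defined and the package was added with `--packages`. The path is left elided as elsewhere in this README, and the threshold is an arbitrary epoch-seconds value chosen only for illustration. Whether a particular filter is actually pushed down depends on the data source implementation.

```scala
// Assumes a spark-shell session with `sqlContext` in scope.
// The path is a placeholder, as elsewhere in this README.
val df = sqlContext.read.format("com.github.sadikovi.spark.netflow").
  option("version", "5").load("file:/...")

// Selecting only `unix_secs` gives the source a chance to skip other fields
// (column pruning); the comparison on `unix_secs` is a candidate for
// predicate pushdown. 1455408000 is an arbitrary example threshold in
// seconds since the epoch.
val recent = df.select("unix_secs").filter("unix_secs >= 1455408000")
recent.show()
```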
### Options
Currently supported options:

| Name | Example | Description |
|------|:-------:|-------------|
-| `version` | _5_ | version to use when parsing NetFlow files
+| `version` | _5, 7_ | version to use when parsing NetFlow files
| `buffer` | _1024, 32Kb, 3Mb, etc_ | buffer size for NetFlow compressed stream (default: 3Mb)
| `stringify` | _true, false_ | convert certain fields (e.g. IP) into human-readable format (default: false)

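To make the `stringify` option more concrete, here is a minimal, self-contained sketch of the kind of conversion it refers to: turning a numeric IPv4 address and a protocol number into readable strings. `StringifySketch`, `ipToString` and `protocolName` are hypothetical names used only for illustration; they are not part of spark-netflow, and the library's actual output format may differ.

```scala
// Hypothetical helpers, not part of spark-netflow.
object StringifySketch {
  // Render an IPv4 address held as an unsigned 32-bit value (in a Long)
  // in dotted-quad notation, most significant byte first.
  def ipToString(ip: Long): String =
    Seq(24, 16, 8, 0).map(shift => (ip >> shift) & 0xFF).mkString(".")

  // Map a few well-known IANA protocol numbers to names;
  // anything else falls back to the number itself.
  def protocolName(proto: Int): String = proto match {
    case 1 => "ICMP"
    case 6 => "TCP"
    case 17 => "UDP"
    case other => other.toString
  }
}
```

For example, `StringifySketch.ipToString(3232235777L)` yields `192.168.1.1`, and `StringifySketch.protocolName(17)` yields `UDP`.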
@@ -44,12 +47,15 @@ val df = sqlContext.read.format("com.github.sadikovi.spark.netflow").
option("version", "5").option("buffer", "50Mb").load("file:/...")
```

-Alternatively you can use shortcut for NetFlow v5 files
+Alternatively you can use shortcuts for NetFlow files
```scala
import com.github.sadikovi.spark.netflow._

// this will read version 5 with default buffer size
-val df = sqlContext.read.netflow("file:/...")
+val df = sqlContext.read.netflow5("file:/...")
+
+// this will read version 7 with fields conversion
+val df = sqlContext.read.option("stringify", "true").netflow7("file:/...")
```

### Python API
1 change: 0 additions & 1 deletion Untitled

This file was deleted.

2 changes: 1 addition & 1 deletion version.sbt
@@ -1 +1 @@
-version in ThisBuild := "0.1.0-SNAPSHOT"
+version in ThisBuild := "0.1.0"
