diff --git a/core/src/main/scala/org/apache/spark/util/Clock.scala b/core/src/main/scala/org/apache/spark/util/Clock.scala
index d2674d4f47224..226f15d3d38c2 100644
--- a/core/src/main/scala/org/apache/spark/util/Clock.scala
+++ b/core/src/main/scala/org/apache/spark/util/Clock.scala
@@ -42,7 +42,7 @@ private[spark] trait Clock {
  *
  * TL;DR: on modern (2.6.32+) Linux kernels with modern (AMD K8+) CPUs, the values returned by
  * `System.nanoTime()` are consistent across CPU cores *and* packages, and provide always
- * increasing values (although it may not be completely monotonic when the the system clock is
+ * increasing values (although it may not be completely monotonic when the system clock is
  * adjusted by NTP daemons using time slew).
  */
 // scalastyle:on line.size.limit
diff --git a/core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala
index 7221623f89e1b..a0da3ca5b5f3b 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala
@@ -83,7 +83,7 @@ class SparkListenerSuite extends SparkFunSuite with LocalSparkContext with Match
     (1 to 5).foreach { _ => bus.post(SparkListenerJobEnd(0, jobCompletionTime, JobSucceeded)) }

     // Five messages should be marked as received and queued, but no messages should be posted to
-    // listeners yet because the the listener bus hasn't been started.
+    // listeners yet because the listener bus hasn't been started.
     assert(bus.metrics.numEventsPosted.getCount === 5)
     assert(bus.queuedEvents.size === 5)
@@ -206,7 +206,7 @@ class SparkListenerSuite extends SparkFunSuite with LocalSparkContext with Match
     assert(sharedQueueSize(bus) === 1)
     assert(numDroppedEvents(bus) === 1)

-    // Allow the the remaining events to be processed so we can stop the listener bus:
+    // Allow the remaining events to be processed so we can stop the listener bus:
     listenerWait.release(2)
     bus.stop()
   }
diff --git a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
index 2b5993a352cb0..0b4e1494bf300 100644
--- a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
@@ -436,7 +436,7 @@ class ExternalAppendOnlyMapSuite extends SparkFunSuite
     val it = map.iterator
     assert(it.isInstanceOf[CompletionIterator[_, _]])
     // org.apache.spark.util.collection.AppendOnlyMap.destructiveSortedIterator returns
-    // an instance of an annonymous Iterator class.
+    // an instance of an anonymous Iterator class.
     val underlyingMapRef = WeakReference(map.currentMap)
diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml
index 36e0b99a07ffd..1149e4704be2e 100644
--- a/docs/_data/menu-sql.yaml
+++ b/docs/_data/menu-sql.yaml
@@ -233,5 +233,5 @@
     url: sql-ref-functions-udf-scalar.html
   - text: Aggregate functions
     url: sql-ref-functions-udf-aggregate.html
-  - text: Arthmetic operations
+  - text: Arithmetic operations
     url: sql-ref-arithmetic-ops.html
diff --git a/docs/configuration.md b/docs/configuration.md
index 497a2ad36b67c..a02733fdbee89 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -2423,7 +2423,7 @@ showDF(properties, numRows = 200, truncate = FALSE)
     Interval at which data received by Spark Streaming receivers is chunked
     into blocks of data before storing them in Spark. Minimum recommended - 50 ms. See the performance
-    tuning section in the Spark Streaming programing guide for more details.
+    tuning section in the Spark Streaming programming guide for more details.
  None | Optional Avro schema (in JSON format) that was used to serialize the data. This should be set if the schema provided for deserialization is compatible with - but not the same as - the one used to originally convert the data to Avro.
- For more information on Avro's schema evolution and compatability, please refer to the [documentation of Confluent](https://docs.confluent.io/current/schema-registry/avro.html).
+ For more information on Avro's schema evolution and compatibility, please refer to the [documentation of Confluent](https://docs.confluent.io/current/schema-registry/avro.html).
  | function from_avro |
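For context on the hunk above, a minimal spark-shell sketch of the `to_avro`/`from_avro` round trip that this option feeds into. The option name and table header are not shown in this hunk, so the sketch only demonstrates the basic functions; the record schema, column names, and package version below are illustrative, and the external `spark-avro` module must be on the classpath.

```scala
// e.g. spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 (version is illustrative)
import org.apache.spark.sql.avro.functions.{from_avro, to_avro}
import org.apache.spark.sql.functions.struct

// Hypothetical record schema used as the deserialization schema. In the schema-evolution
// scenario described above, this would be compatible with, but not identical to, the
// schema the data was originally written with.
val jsonFormatSchema =
  """{"type":"record","name":"user","fields":[
    |  {"name":"name","type":"string"},
    |  {"name":"age","type":"int"}
    |]}""".stripMargin

val df = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")  // relies on spark-shell implicits
val serialized = df.select(to_avro(struct(df("name"), df("age"))).as("value"))
serialized.select(from_avro(serialized("value"), jsonFormatSchema).as("user"))
  .select("user.name", "user.age")
  .show()
```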
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 1db2a7d41082b..674621f3fdfaf 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -220,7 +220,7 @@ license: |
- Since Spark 3.0, when casting interval values to string type, there is no "interval" prefix, e.g. `1 days 2 hours`. In Spark version 2.4 and earlier, the string contains the "interval" prefix like `interval 1 days 2 hours`.
- - Since Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces(<= ACSII 32) will be trimmed before converted to these type values, e.g. `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and earlier, while casting string to integrals and booleans, it will not trim the whitespaces from both ends, the foregoing results will be `null`, while to datetimes, only the trailing spaces(= ASCII 32) will be removed.
+ - Since Spark 3.0, when casting string values to integral types (tinyint, smallint, int and bigint), datetime types (date, timestamp and interval) and the boolean type, the leading and trailing whitespace (<= ASCII 32) is trimmed before the value is converted, e.g. `cast(' 1\t' as int)` results in `1`, `cast(' 1\t' as boolean)` results in `true`, and `cast('2019-10-10\t' as date)` results in the date value `2019-10-10`. In Spark version 2.4 and earlier, casting a string to integrals and booleans does not trim the whitespace from both ends, so the foregoing results are `null`, while for datetimes only the trailing spaces (= ASCII 32) are removed.
- Since Spark 3.0, numbers written in scientific notation(e.g. `1E2`) would be parsed as Double. In Spark version 2.4 and earlier, they're parsed as Decimal. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.exponentLiteralAsDecimal.enabled` to `true`.
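A short spark-shell sketch of the two behavior changes described in the notes above; the expected values restate the migration guide's own examples (Spark 2.4 returns `null` for the first two casts and parses `1E2` as a decimal). It assumes a running `SparkSession` named `spark`, as in spark-shell.

```scala
// The \t below is a real tab character inside the SQL string literal.
spark.sql("SELECT CAST(' 1\t' AS INT)").show()           // 1      (null in Spark 2.4)
spark.sql("SELECT CAST(' 1\t' AS BOOLEAN)").show()       // true   (null in Spark 2.4)
spark.sql("SELECT CAST('2019-10-10\t' AS DATE)").show()  // 2019-10-10
spark.sql("SELECT 1E2").printSchema()                    // double (decimal in Spark 2.4)
```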
diff --git a/docs/sql-pyspark-pandas-with-arrow.md b/docs/sql-pyspark-pandas-with-arrow.md
index d638278b42355..7eb8a74547f70 100644
--- a/docs/sql-pyspark-pandas-with-arrow.md
+++ b/docs/sql-pyspark-pandas-with-arrow.md
@@ -255,7 +255,7 @@ different than a Pandas timestamp. It is recommended to use Pandas time series f
working with timestamps in `pandas_udf`s to get the best performance, see
[here](https://pandas.pydata.org/pandas-docs/stable/timeseries.html) for details.
-### Compatibiliy Setting for PyArrow >= 0.15.0 and Spark 2.3.x, 2.4.x
+### Compatibility Setting for PyArrow >= 0.15.0 and Spark 2.3.x, 2.4.x
Since Arrow 0.15.0, a change in the binary IPC format requires an environment variable to be
compatible with previous versions of Arrow <= 0.14.1. This is only necessary to do for PySpark
diff --git a/docs/sql-ref-null-semantics.md b/docs/sql-ref-null-semantics.md
index fd467d224ffd5..3cbc15c600cee 100644
--- a/docs/sql-ref-null-semantics.md
+++ b/docs/sql-ref-null-semantics.md
@@ -25,14 +25,14 @@ A column is associated with a data type and represents
a specific attribute of an entity (for example, `age` is a column of an
entity called `person`). Sometimes, the value of a column
specific to a row is not known at the time the row comes into existence.
-In `SQL`, such values are represnted as `NULL`. This section details the
+In `SQL`, such values are represented as `NULL`. This section details the
semantics of `NULL` values handling in various operators, expressions and
other `SQL` constructs.
1. [Null handling in comparison operators](#comp-operators)
2. [Null handling in Logical operators](#logical-operators)
3. [Null handling in Expressions](#expressions)
- 1. [Null handling in null-in-tolerant expressions](#null-in-tolerant)
+ 1. [Null handling in null-intolerant expressions](#null-intolerant)
2. [Null handling Expressions that can process null value operands](#can-process-null)
3. [Null handling in built-in aggregate expressions](#built-in-aggregate)
4. [Null handling in WHERE, HAVING and JOIN conditions](#condition-expressions)
@@ -61,10 +61,10 @@ the `age` column and this table will be used in various examples in the sections
700 | Dan | 50 |
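A minimal spark-shell sketch (assuming a `SparkSession` named `spark`) of the comparison-operator behavior listed first in the contents above: any comparison with `NULL` evaluates to `NULL` (unknown) rather than `true` or `false`, which is why the `IS [NOT] NULL` predicates exist.

```scala
// Comparing NULL with anything, including NULL itself, yields NULL; IS NULL yields true/false.
spark.sql("SELECT NULL = NULL AS eq, NULL IS NULL AS is_null").show()
// `eq` is null (unknown), `is_null` is true
```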