Commit 9c37b1f

ARF authored and jreback committed
Introduction of RangeIndex
`RangeIndex(1, 10, 2)` is a memory-saving alternative to `Index(np.arange(1, 10, 2))`; cf. #939.

This re-implementation is compatible with the current `Index()` API and is a drop-in replacement for `Int64Index()`. It automatically converts to `Int64Index()` when an operation requires it. At present the type is conserved for only a small set of operations (e.g. slicing and inner, left and right joins). Most other operations trigger creation of an equivalent `Int64Index` (or at least an equivalent numpy array) and fall back to its implementation.

This PR also extends the functionality of the `Index()` constructor to allow creation of `RangeIndex` objects with

```
Index(20)
Index(2, 20)
Index(0, 20, 2)
```

in analogy to

```
range(20)
range(2, 20)
range(0, 20, 2)
```

restore Index() fastpath precedence

Various fixes suggested by @jreback and @shoyer

Cache a private Int64Index object the first time it or its values are required.
Restore Index(5) as error. Restore its test.
Allow Index(0, 5) and Index(0, 5, 1).
Make RangeIndex immutable. See start, stop, step properties.
In test_constructor(): check class, attributes (possibly including dtype).
In test_copy(): check that copy is not identical (but equal) to the existing.
In test_duplicates(): Assert is_unique and has_duplicates return correct values.

fix slicing

fix view

Set RangeIndex as default index

* enh: set RangeIndex as default index
* fix: pandas.io.packers: encode() and decode() for RangeIndex
* enh: array argument pass-through
* fix: reindex
* fix: use _default_index() in pandas.core.frame.extract_index()
* fix: pandas.core.index.Index._is()
* fix: add RangeIndex to ABCIndexClass
* fix: use _default_index() in _get_names_from_index()
* fix: pytables tests
* fix: MultiIndex.get_level_values()
* fix: RangeIndex._shallow_copy()
* fix: null-size RangeIndex equals() comparison
* enh: make RangeIndex.is_unique immutable

enh: various performance optimizations

* optimize argsort()
* optimize tolist()
* comment clean-up
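The behaviour described in the commit message (store only start, stop and step; keep the type when slicing; materialize and cache the full values only on first use; expose read-only properties) can be illustrated with a small pure-Python sketch. The class `LazyRangeIndex` and its `_cached` attribute are hypothetical names chosen for illustration. This is not the pandas implementation, and a plain list stands in for the cached `Int64Index`.

```python
class LazyRangeIndex:
    """Hypothetical stand-in for the RangeIndex idea; not pandas code."""

    def __init__(self, start, stop, step=1):
        # Only three integers are stored, regardless of length.
        self._range = range(start, stop, step)
        # Materialized values, built the first time they are needed.
        self._cached = None

    # Read-only properties: there are no setters, so assignment raises
    # AttributeError, which keeps the object effectively immutable.
    @property
    def start(self):
        return self._range.start

    @property
    def stop(self):
        return self._range.stop

    @property
    def step(self):
        return self._range.step

    def __len__(self):
        return len(self._range)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing a range yields another range, so the type is conserved.
            r = self._range[key]
            return LazyRangeIndex(r.start, r.stop, r.step)
        return self._range[key]

    @property
    def values(self):
        # Fall back to a materialized list (standing in for Int64Index)
        # and cache it so later calls do not rebuild it.
        if self._cached is None:
            self._cached = list(self._range)
        return self._cached


idx = LazyRangeIndex(1, 10, 2)
sliced = idx[1:3]      # still a LazyRangeIndex, nothing materialized yet
materialized = idx.values  # first access builds and caches the list
```

Keeping the slice path free of materialization is the point of the design: operations that can be expressed on (start, stop, step) stay cheap, and everything else pays the conversion cost once.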
1 parent 1945eed commit 9c37b1f

7 files changed: +1068 -27 lines changed

pandas/core/api.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 from pandas.core.categorical import Categorical
 from pandas.core.groupby import Grouper
 from pandas.core.format import set_eng_float_format
-from pandas.core.index import Index, CategoricalIndex, Int64Index, Float64Index, MultiIndex
+from pandas.core.index import Index, CategoricalIndex, Int64Index, RangeIndex, Float64Index, MultiIndex
 
 from pandas.core.series import Series, TimeSeries
 from pandas.core.frame import DataFrame

pandas/core/common.py

Lines changed: 6 additions & 5 deletions
@@ -84,6 +84,8 @@ def _check(cls, inst):
 ABCIndex = create_pandas_abc_type("ABCIndex", "_typ", ("index", ))
 ABCInt64Index = create_pandas_abc_type("ABCInt64Index", "_typ",
                                        ("int64index", ))
+ABCRangeIndex = create_pandas_abc_type("ABCRangeIndex", "_typ",
+                                       ("rangeindex", ))
 ABCFloat64Index = create_pandas_abc_type("ABCFloat64Index", "_typ",
                                          ("float64index", ))
 ABCMultiIndex = create_pandas_abc_type("ABCMultiIndex", "_typ",
@@ -97,7 +99,8 @@ def _check(cls, inst):
 ABCCategoricalIndex = create_pandas_abc_type("ABCCategoricalIndex", "_typ",
                                              ("categoricalindex", ))
 ABCIndexClass = create_pandas_abc_type("ABCIndexClass", "_typ",
-                                       ("index", "int64index", "float64index",
+                                       ("index", "int64index", "rangeindex",
+                                        "float64index",
                                         "multiindex", "datetimeindex",
                                         "timedeltaindex", "periodindex",
                                         "categoricalindex"))
@@ -1796,10 +1799,8 @@ def is_bool_indexer(key):
 
 
 def _default_index(n):
-    from pandas.core.index import Int64Index
-    values = np.arange(n, dtype=np.int64)
-    result = Int64Index(values, name=None)
-    result.is_unique = True
+    from pandas.core.index import RangeIndex
+    result = RangeIndex(0, int(n), name=None)
     return result
 
 
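The change to `_default_index()` swaps an eagerly built array of n positions for a range-like object. As a rough, pure-Python analogy (not pandas code), the sketch below compares a lazy `range` with a materialized list of the same positions. The helper name `default_positions` is hypothetical, and `sys.getsizeof` reports only the container's own size, so the figures are indicative rather than exact.

```python
import sys


def default_positions(n):
    """Hypothetical analogue of _default_index(): lazy positions 0..n-1."""
    # int(n) mirrors the coercion used in the diff above.
    return range(0, int(n))


n = 1_000_000
lazy = default_positions(n)
eager = list(lazy)  # what an eagerly materialized default index resembles

lazy_bytes = sys.getsizeof(lazy)    # constant, independent of n
eager_bytes = sys.getsizeof(eager)  # grows linearly with n
print(lazy_bytes, eager_bytes)
```

Both objects answer `len()` and positional lookups identically, which is why the lazy form can serve as a default index without changing results.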
pandas/core/frame.py

Lines changed: 3 additions & 3 deletions
@@ -5325,7 +5325,7 @@ def extract_index(data):
                            (lengths[0], len(index)))
                     raise ValueError(msg)
             else:
-                index = Index(np.arange(lengths[0]))
+                index = _default_index(lengths[0])
 
     return _ensure_index(index)
 
@@ -5538,11 +5538,11 @@ def convert(arr):
 
 
 def _get_names_from_index(data):
-    index = lrange(len(data))
     has_some_name = any([getattr(s, 'name', None) is not None for s in data])
     if not has_some_name:
-        return index
+        return _default_index(len(data))
 
+    index = lrange(len(data))
     count = 0
     for i, s in enumerate(data):
         n = getattr(s, 'name', None)
