NVIDIA
diff --git a/docs/cuda-bindings/jupyter_execute/overview.ipynb b/docs/cuda-bindings/jupyter_execute/overview.ipynb
Lines changed: 25 additions & 25 deletions
diff --git a/docs/cuda-bindings/latest/.doctrees/environment.pickle b/docs/cuda-bindings/latest/.doctrees/environment.pickle
0 Bytes
diff --git a/docs/cuda-core/latest/.doctrees/environment.pickle b/docs/cuda-core/latest/.doctrees/environment.pickle
411 Bytes
diff --git a/docs/cuda-core/latest/.doctrees/install.doctree b/docs/cuda-core/latest/.doctrees/install.doctree
2.35 KB
diff --git a/docs/cuda-core/latest/.doctrees/release/0.1.1-notes.doctree b/docs/cuda-core/latest/.doctrees/release/0.1.1-notes.doctree
-4 Bytes
diff --git a/docs/cuda-core/latest/_sources/install.md.txt b/docs/cuda-core/latest/_sources/install.md.txt
Lines changed: 15 additions & 0 deletions
diff --git a/docs/cuda-core/latest/_sources/release/0.1.1-notes.md.txt b/docs/cuda-core/latest/_sources/release/0.1.1-notes.md.txt
Lines changed: 1 addition & 1 deletion
diff --git a/docs/cuda-core/latest/index.html b/docs/cuda-core/latest/index.html
Lines changed: 1 addition & 0 deletions
diff --git a/docs/cuda-core/latest/install.html b/docs/cuda-core/latest/install.html
Lines changed: 15 additions & 0 deletions
diff --git a/docs/cuda-core/latest/release.html b/docs/cuda-core/latest/release.html
Lines changed: 1 addition & 1 deletion
diff --git a/docs/cuda-core/latest/release/0.1.1-notes.html b/docs/cuda-core/latest/release/0.1.1-notes.html
Lines changed: 3 additions & 3 deletions
diff --git a/docs/cuda-core/latest/searchindex.js b/docs/cuda-core/latest/searchindex.js
Lines changed: 1 addition & 1 deletion
diff --git a/docs/latest/.doctrees/environment.pickle b/docs/latest/.doctrees/environment.pickle
0 Bytes
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "2c2ee760",
+   "id": "89cc298e",
    "metadata": {},
    "source": [
     "# Overview\n",
@@ -50,7 +50,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "id": "eec9439d",
+   "id": "fbbf48f8",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -60,7 +60,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a02f0fed",
+   "id": "e7856b1c",
    "metadata": {},
    "source": [
     "Error checking is a fundamental best practice in code development and a code\n",
@@ -72,7 +72,7 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "aba234e7",
+   "id": "a15ca753",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -98,7 +98,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "aedde00f",
+   "id": "2f6edb25",
    "metadata": {},
    "source": [
     "It’s common practice to write CUDA kernels near the top of a translation unit,\n",
@@ -112,7 +112,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "9501b26e",
+   "id": "ad3b35ea",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -130,7 +130,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "ba8aa6dd",
+   "id": "3b497b44",
    "metadata": {},
    "source": [
     "Go ahead and compile the kernel into PTX. Remember that this is executed at runtime using NVRTC. There are three basic steps to NVRTC:\n",
@@ -147,7 +147,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "3f34e779",
+   "id": "183f49bc",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -177,7 +177,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fb09c972",
+   "id": "0981c1a8",
    "metadata": {},
    "source": [
     "Before you can use the PTX or do any work on the GPU, you must create a CUDA\n",
@@ -189,7 +189,7 @@
   {
    "cell_type": "code",
    "execution_count": 5,
-   "id": "ccb32289",
+   "id": "0fb562ab",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -199,7 +199,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a2f56747",
+   "id": "d8331acd",
    "metadata": {},
    "source": [
     "With a CUDA context created on device 0, load the PTX generated earlier into a\n",
@@ -211,7 +211,7 @@
   {
    "cell_type": "code",
    "execution_count": 6,
-   "id": "d4fbd234",
+   "id": "fb3af604",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -224,7 +224,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "84b6af96",
+   "id": "cfda3062",
    "metadata": {},
    "source": [
     "Next, get all your data prepared and transferred to the GPU. For increased\n",
@@ -236,7 +236,7 @@
   {
    "cell_type": "code",
    "execution_count": 7,
-   "id": "857cb9dc",
+   "id": "a7678a2f",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -254,7 +254,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "299122b9",
+   "id": "0a7b45b0",
    "metadata": {},
    "source": [
     "With the input data `a`, `x`, and `y` created for the SAXPY transform device,\n",
@@ -271,7 +271,7 @@
   {
    "cell_type": "code",
    "execution_count": 8,
-   "id": "19f9e83f",
+   "id": "d459bd0b",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -291,7 +291,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "6e703eba",
+   "id": "39ccaa76",
    "metadata": {},
    "source": [
     "With data prep and resources allocation finished, the kernel is ready to be\n",
@@ -308,7 +308,7 @@
   {
    "cell_type": "code",
    "execution_count": 9,
-   "id": "5a917142",
+   "id": "d50cc757",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -324,7 +324,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "65d9e077",
+   "id": "9137714f",
    "metadata": {},
    "source": [
     "Now the kernel can be launched:"
@@ -333,7 +333,7 @@
   {
    "cell_type": "code",
    "execution_count": 10,
-   "id": "310fdfe7",
+   "id": "2963c1cd",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -359,7 +359,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "c8c842c7",
+   "id": "4bc876e6",
    "metadata": {},
    "source": [
     "The `cuLaunchKernel` function takes the compiled module kernel and execution\n",
@@ -374,7 +374,7 @@
   {
    "cell_type": "code",
    "execution_count": 11,
-   "id": "3adea158",
+   "id": "ac71e923",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -386,7 +386,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "96dd7306",
+   "id": "ea813e42",
    "metadata": {},
    "source": [
     "Perform verification of the data to ensure correctness and finish the code with\n",
@@ -396,7 +396,7 @@
   {
    "cell_type": "code",
    "execution_count": 12,
-   "id": "cd0c402e",
+   "id": "c29ce423",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -410,7 +410,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "d08fe6ce",
+   "id": "a4a86e92",
    "metadata": {},
    "source": [
     "## Performance\n",
 
@@ -25,6 +25,21 @@ and likewise use `[cu11]` for CUDA 11.
 Note that using `cuda.core` with NVRTC or nvJitLink installed from PyPI via `pip install` is currently
 not supported. This will be fixed in a future release.
 
+## Installing from Conda (conda-forge)
+
+Same as above, `cuda.core` can be installed in a CUDA 11 or 12 environment. For example with CUDA 12:
+```console
+$ conda install -c conda-forge cuda-core cuda-version=12
+```
+and likewise use `cuda-version=11` for CUDA 11.
+
+Note that to use `cuda.core` with nvJitLink installed from conda-forge currently requires it to
+be separately installed:
+```console
+$ conda install -c conda-forge libnvjitlink
+```
+(can be combined with the command above). This extra step will be removed in a future release.
+
 ## Installing from Source
 
 ```console
 
@@ -16,7 +16,7 @@ Released on Dec 20, 2024
 - Add a `cuda.core.experimental.system` module for querying system- or process- wide information.
 - Add `LaunchConfig.cluster` to support thread block clusters on Hopper GPUs.
 
-## Enchancements
+## Enhancements
 
 - The internal handle held by `ObjectCode` is now lazily initialized upon first touch.
 - Support TCC devices with a default synchronous memory resource to avoid the use of memory pools.
 
@@ -304,6 +304,7 @@ <h1><code class="docutils literal notranslate"><span class="pre">cuda.core</span
 <li class="toctree-l1"><a class="reference internal" href="install.html">Installation</a><ul>
 <li class="toctree-l2"><a class="reference internal" href="install.html#runtime-requirements">Runtime Requirements</a></li>
 <li class="toctree-l2"><a class="reference internal" href="install.html#installing-from-pypi">Installing from PyPI</a></li>
+<li class="toctree-l2"><a class="reference internal" href="install.html#installing-from-conda-conda-forge">Installing from Conda (conda-forge)</a></li>
 <li class="toctree-l2"><a class="reference internal" href="install.html#installing-from-source">Installing from Source</a></li>
 </ul>
 </li>
 
@@ -327,6 +327,20 @@ <h2>Installing from PyPI<a class="headerlink" href="#installing-from-pypi" title
 <p>Note that using <code class="docutils literal notranslate"><span class="pre">cuda.core</span></code> with NVRTC or nvJitLink installed from PyPI via <code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span></code> is currently
 not supported. This will be fixed in a future release.</p>
 </section>
+<section id="installing-from-conda-conda-forge">
+<h2>Installing from Conda (conda-forge)<a class="headerlink" href="#installing-from-conda-conda-forge" title="Link to this heading">¶</a></h2>
+<p>Same as above, <code class="docutils literal notranslate"><span class="pre">cuda.core</span></code> can be installed in a CUDA 11 or 12 environment. For example with CUDA 12:</p>
+<div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>conda<span class="w"> </span>install<span class="w"> </span>-c<span class="w"> </span>conda-forge<span class="w"> </span>cuda-core<span class="w"> </span>cuda-version<span class="o">=</span><span class="m">12</span>
+</pre></div>
+</div>
+<p>and likewise use <code class="docutils literal notranslate"><span class="pre">cuda-version=11</span></code> for CUDA 11.</p>
+<p>Note that to use <code class="docutils literal notranslate"><span class="pre">cuda.core</span></code> with nvJitLink installed from conda-forge currently requires it to
+be separately installed:</p>
+<div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>conda<span class="w"> </span>install<span class="w"> </span>-c<span class="w"> </span>conda-forge<span class="w"> </span>libnvjitlink
+</pre></div>
+</div>
+<p>(can be combined with the command above). This extra step will be removed in a future release.</p>
+</section>
 <section id="installing-from-source">
 <h2>Installing from Source<a class="headerlink" href="#installing-from-source" title="Link to this heading">¶</a></h2>
 <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>git<span class="w"> </span>clone<span class="w"> </span>https://github.com/NVIDIA/cuda-python
@@ -403,6 +417,7 @@ <h2>Installing from Source<a class="headerlink" href="#installing-from-source" t
 <li><a class="reference internal" href="#">Installation</a><ul>
 <li><a class="reference internal" href="#runtime-requirements">Runtime Requirements</a></li>
 <li><a class="reference internal" href="#installing-from-pypi">Installing from PyPI</a></li>
+<li><a class="reference internal" href="#installing-from-conda-conda-forge">Installing from Conda (conda-forge)</a></li>
 <li><a class="reference internal" href="#installing-from-source">Installing from Source</a></li>
 </ul>
 </li>
 
@@ -296,7 +296,7 @@ <h1>Release Notes<a class="headerlink" href="#release-notes" title="Link to this
 <li class="toctree-l1"><a class="reference internal" href="release/0.1.1-notes.html">    0.1.1</a><ul>
 <li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#hightlights">Hightlights</a></li>
 <li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#new-features">New features</a></li>
-<li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#enchancements">Enchancements</a></li>
+<li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#enhancements">Enhancements</a></li>
 <li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#bug-fixes">Bug fixes</a></li>
 <li class="toctree-l2"><a class="reference internal" href="release/0.1.1-notes.html#limitations">Limitations</a></li>
 </ul>
 
@@ -310,8 +310,8 @@ <h2>New features<a class="headerlink" href="#new-features" title="Link to this h
 <li><p>Add <code class="docutils literal notranslate"><span class="pre">LaunchConfig.cluster</span></code> to support thread block clusters on Hopper GPUs.</p></li>
 </ul>
 </section>
-<section id="enchancements">
-<h2>Enchancements<a class="headerlink" href="#enchancements" title="Link to this heading">¶</a></h2>
+<section id="enhancements">
+<h2>Enhancements<a class="headerlink" href="#enhancements" title="Link to this heading">¶</a></h2>
 <ul class="simple">
 <li><p>The internal handle held by <code class="docutils literal notranslate"><span class="pre">ObjectCode</span></code> is now lazily initialized upon first touch.</p></li>
 <li><p>Support TCC devices with a default synchronous memory resource to avoid the use of memory pools.</p></li>
@@ -402,7 +402,7 @@ <h2>Limitations<a class="headerlink" href="#limitations" title="Link to this hea
 <li><a class="reference internal" href="#"><code class="docutils literal notranslate"><span class="pre">cuda.core</span></code> v0.1.1 Release notes</a><ul>
 <li><a class="reference internal" href="#hightlights">Hightlights</a></li>
 <li><a class="reference internal" href="#new-features">New features</a></li>
-<li><a class="reference internal" href="#enchancements">Enchancements</a></li>
+<li><a class="reference internal" href="#enhancements">Enhancements</a></li>
 <li><a class="reference internal" href="#bug-fixes">Bug fixes</a></li>
 <li><a class="reference internal" href="#limitations">Limitations</a></li>
 </ul>