@@ -1518,6 +1518,275 @@ parameters of protocol-qualified type.
1518
1518
Query the presence of this new mangling with
1519
1519
``__has_feature(objc_protocol_qualifier_mangling) ``.
1520
1520
1521
+
1522
+ OpenCL Features
1523
+ ===============
1524
+
1525
+ C++ for OpenCL
1526
+ --------------
1527
+
1528
+ This functionality is built on top of OpenCL C v2.0 and C++17. Regular C++
1529
+ features can be used in OpenCL kernel code. All functionality from OpenCL C
1530
+ is inherited. This section describes minor differences to OpenCL C and any
1531
+ limitations related to C++ support as well as interactions between OpenCL and
1532
+ C++ features that are not documented elsewhere.
1533
+
1534
+ Restrictions to C++17
1535
+ ^^^^^^^^^^^^^^^^^^^^^
1536
+
1537
+ The following features are not supported:
1538
+
1539
+ - Virtual functions
1540
+ - ``dynamic_cast `` operator
1541
+ - Non-placement ``new ``/``delete `` operators
1542
+ - Standard C++ libraries. Currently there is no solution for alternative C++
1543
+ libraries provided. Future release will feature library support.
1544
+
1545
+
1546
+ Interplay of OpenCL and C++ features
1547
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1548
+
1549
+ Address space behavior
1550
+ """"""""""""""""""""""
1551
+
1552
+ Address spaces are part of the type qualifiers; many rules are just inherited
1553
+ from the qualifier behavior documented in OpenCL C v2.0 s6.5 and Embedded C
1554
+ extension ISO/IEC JTC1 SC22 WG14 N1021 s3.1. Note that since the address space
1555
+ behavior in C++ is not documented formally yet, Clang extends existing concept
1556
+ from C and OpenCL. For example conversion rules are extended from qualification
1557
+ conversion but the compatibility is determined using sets and overlapping from
1558
+ Embedded C (ISO/IEC JTC1 SC22 WG14 N1021 s3.1.3). For OpenCL it means that
1559
+ implicit conversions are allowed from named to ``__generic `` but not vice versa
1560
+ (OpenCL C v2.0 s6.5.5) except for ``__constant `` address space. Most of the
1561
+ rules are built on top of this behavior.
1562
+
1563
+ **Casts **
1564
+
1565
+ C style cast will follow OpenCL C v2.0 rules (s6.5.5). All cast operators will
1566
+ permit implicit conversion to ``__generic ``. However converting from named
1567
+ address spaces to ``__generic `` can only be done using ``addrspace_cast ``. Note
1568
+ that conversions between ``__constant `` and any other is still disallowed.
1569
+
1570
+ .. _opencl_cpp_addrsp_deduction :
1571
+
1572
+ **Deduction **
1573
+
1574
+ Address spaces are not deduced for:
1575
+
1576
+ - non-pointer/non-reference template parameters or any dependent types except
1577
+ for template specializations.
1578
+ - non-pointer/non-reference class members except for static data members that are
1579
+ deduced to ``__global `` address space.
1580
+ - non-pointer/non-reference alias declarations.
1581
+ - ``decltype `` expression.
1582
+
1583
+ .. code-block :: c++
1584
+
1585
+ template <typename T>
1586
+ void foo () {
1587
+ T m; // address space of m will be known at template instantiation time.
1588
+ T * ptr; // ptr points to __generic address space object.
1589
+ T & ref = ...; // ref references an object in __generic address space.
1590
+ };
1591
+
1592
+ template <int N>
1593
+ struct S {
1594
+ int i; // i has no address space
1595
+ static int ii; // ii is in global address space
1596
+ int * ptr; // ptr points to __generic address space int.
1597
+ int & ref = ...; // ref references int in __generic address space.
1598
+ };
1599
+
1600
+ template <int N>
1601
+ void bar ()
1602
+ {
1603
+ S<N> s; // s is in __private address space
1604
+ }
1605
+
1606
+ TODO: Add example for type alias and decltype!
1607
+
1608
+ **References **
1609
+
1610
+ References types can be qualified with an address space.
1611
+
1612
+ .. code-block :: c++
1613
+
1614
+ __private int & ref = ...; // references int in __private address space
1615
+
1616
+ By default references will refer to ``__generic `` address space objects, except
1617
+ for dependent types that are not template specializations
1618
+ (see :ref: `Deduction <opencl_cpp_addrsp_deduction >`). Address space compatibility
1619
+ checks are performed when references are bound to values. The logic follows the
1620
+ rules from address space pointer conversion (OpenCL v2.0 s6.5.5).
1621
+
1622
+ **Default address space **
1623
+
1624
+ All non-static member functions take an implicit object parameter ``this `` that
1625
+ is a pointer type. By default this pointer parameter is in ``__generic `` address
1626
+ space. All concrete objects passed as an argument to ``this `` parameter will be
1627
+ converted to ``__generic `` address space first if the conversion is valid.
1628
+ Therefore programs using objects in ``__constant `` address space won't be compiled
1629
+ unless address space is explicitly specified using address space qualifiers on
1630
+ member functions
1631
+ (see :ref: `Member function qualifier <opencl_cpp_addrspace_method_qual >`) as the
1632
+ conversion between ``__constant `` and ``__generic `` is disallowed. Member function
1633
+ qualifiers can also be used in case conversion to ``__generic `` address space is
1634
+ undesirable (even if it is legal), for example to take advantage of memory bank
1635
+ accesses. Note this not only applies to regular member functions but to
1636
+ constructors and destructors too.
1637
+
1638
+ .. _opencl_cpp_addrspace_method_qual :
1639
+
1640
+ **Member function qualifier **
1641
+
1642
+ Clang allows specifying address space qualifier on member functions to signal that
1643
+ they are to be used with objects constructed in some specific address space. This
1644
+ works just the same as qualifying member functions with ``const `` or any other
1645
+ qualifiers. The overloading resolution will select overload with most specific
1646
+ address space if multiple candidates are provided. If there is no conversion to
1647
+ to an address space among existing overloads compilation will fail with a
1648
+ diagnostic.
1649
+
1650
+ .. code-block :: c++
1651
+
1652
+ struct C {
1653
+ void foo () __local;
1654
+ void foo ();
1655
+ };
1656
+
1657
+ __kernel void bar () {
1658
+ __local C c1;
1659
+ C c2;
1660
+ __constant C c3;
1661
+ c1.foo (); // will resolve to the first foo
1662
+ c2.foo (); // will resolve to the second foo
1663
+ c3.foo (); // error due to mismatching address spaces - can't convert to
1664
+ // __local or __generic
1665
+ }
1666
+
1667
+ **Implicit special members **
1668
+
1669
+ All implicit special members (default, copy, or move constructor, copy or move
1670
+ assignment, destructor) will be generated with ``__generic `` address space.
1671
+
1672
+ .. code-block :: c++
1673
+
1674
+ class C {
1675
+ // Has the following implicit definition
1676
+ // void C () __generic;
1677
+ // void C (const __generic C &) __generic;
1678
+ // void C (__generic C &&) __generic;
1679
+ // operator= '__generic C &(__generic C &&)'
1680
+ // operator= '__generic C &(const __generic C &) __generic
1681
+ }
1682
+
1683
+ **Builtin operators **
1684
+
1685
+ All builtin operators are available in the specific address spaces, thus no conversion
1686
+ to ``__generic `` is performed.
1687
+
1688
+ **Templates **
1689
+
1690
+ There is no deduction of address spaces in non-pointer/non-reference template parameters
1691
+ and dependent types (see :ref: `Deduction <opencl_cpp_addrsp_deduction >`). The address
1692
+ space of template parameter is deduced during the type deduction if it's not explicitly
1693
+ provided in instantiation.
1694
+
1695
+ .. code-block :: c++
1696
+
1697
+ 1 template<typename T>
1698
+ 2 void foo (T* i){
1699
+ 3 T var;
1700
+ 4 }
1701
+ 5
1702
+ 6 __global int g;
1703
+ 7 void bar (){
1704
+ 8 foo (&g); // error: template instantiation failed as function scope variable appears to
1705
+ 9 // be declared in __global address space (see line 3)
1706
+ 10 }
1707
+
1708
+ It is not legal to specify multiple different address spaces between template definition and
1709
+ instantiation. If multiple different address spaces are specified in template definition and
1710
+ instantiation compilation of such program will fail with a diagnostic.
1711
+
1712
+ .. code-block :: c++
1713
+
1714
+ template <typename T>
1715
+ void foo () {
1716
+ __private T var;
1717
+ }
1718
+
1719
+ void bar () {
1720
+ foo<__global int>(); // error: conflicting address space qualifiers are provided __global
1721
+ // and __private
1722
+ }
1723
+
1724
+ Once template is instantiated regular restrictions for address spaces will apply.
1725
+
1726
+ .. code-block :: c++
1727
+
1728
+ template<typename T>
1729
+ void foo (){
1730
+ T var;
1731
+ }
1732
+
1733
+ void bar (){
1734
+ foo<__global int>(); // error: function scope variable cannot be declared in __global
1735
+ // address space
1736
+ }
1737
+
1738
+ **Temporary materialization **
1739
+
1740
+ All temporaries are materialized in ``__private `` address space. If a reference with some
1741
+ other address space is bound to them, the conversion will be generated in case it's valid
1742
+ otherwise compilation will fail with a diagnostic.
1743
+
1744
+ .. code-block :: c++
1745
+
1746
+ int bar (const unsigned int &i);
1747
+
1748
+ void foo () {
1749
+ bar (1); // temporary is created in __private address space but converted
1750
+ // to __generic address space of parameter reference
1751
+ }
1752
+
1753
+ __global const int& f (__global float &ref) {
1754
+ return ref; // error: address space mismatch between temporary object
1755
+ // created to hold value converted float->int and return
1756
+ // value type (can't convert from __private to __global)
1757
+ }
1758
+
1759
+ **Initialization of local and constant address space objects **
1760
+
1761
+ TODO
1762
+
1763
+ Constructing and destroying global objects
1764
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1765
+
1766
+ Global objects are constructed before the first kernel using the global
1767
+ objects is executed and destroyed just after the last kernel using the
1768
+ program objects is executed. In OpenCL v2.0 drivers there is no specific
1769
+ API for invoking global constructors. However, an easy workaround would be
1770
+ to enqueue constructor initialization kernel that has a name
1771
+ ``@_GLOBAL__sub_I_<compiled file name> ``. This kernel is only present if there
1772
+ are any global objects to be initialized in the compiled binary. One way to
1773
+ check this is by passing ``CL_PROGRAM_KERNEL_NAMES `` to ``clGetProgramInfo ``
1774
+ (OpenCL v2.0 s5.8.7).
1775
+
1776
+ Note that if multiple files are compiled and linked into libraries multiple
1777
+ kernels that initialize global objects for multiple modules would have to be
1778
+ invoked.
1779
+
1780
+ .. code-block :: console
1781
+
1782
+ clang -cl-std=c++ test.cl
1783
+
1784
+ If there are any global objects to be initialized the final binary will
1785
+ contain ``@_GLOBAL__sub_I_test.cl `` kernel to be enqueued.
1786
+
1787
+ Global destructors can not be invoked in OpenCL v2.0 drivers. However, all
1788
+ memory used for program scope objects is released on ``clReleaseProgram ``.
1789
+
1521
1790
Initializer lists for complex numbers in C
1522
1791
==========================================
1523
1792
0 commit comments