
Commit b24ddde

Add docs
1 parent 6b35e8f commit b24ddde

2 files changed: +263 -0 lines changed

docs/make.jl

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ const _PAGES = [
     ],
     "Nonlinear" => [
         "Overview" => "submodules/Nonlinear/overview.md",
+        "SymbolicAD" => "submodules/Nonlinear/SymbolicAD.md",
         "API Reference" => "submodules/Nonlinear/reference.md",
     ],
     "Utilities" => [

docs/src/submodules/Nonlinear/SymbolicAD.md

Lines changed: 262 additions & 0 deletions

@@ -0,0 +1,262 @@

```@meta
CurrentModule = MathOptInterface
DocTestSetup = quote
    import MathOptInterface as MOI
end
DocTestFilters = [r"MathOptInterface|MOI"]
```

# SymbolicAD

The `Nonlinear.SymbolicAD` submodule contains data structures and functions for
working with the symbolic derivatives of a nonlinear optimization problem. This
page explains the API and describes the rationale behind its design.

## Background

The code in `SymbolicAD` is inspired by Hassan Hijazi's work on
[coin-or/Gravity](https://github.com/coin-or/Gravity), a high-performance
algebraic modeling language in C++.

Hassan made the following observations:

* For large scale models, symbolic differentiation is slower than other
  automatic differentiation techniques, such as the reverse mode algorithm
  implemented in `MOI.Nonlinear.ReverseAD`.
* However, most large-scale nonlinear programs have a lot of structure.
* Gravity asks the user to provide structure in the form of
  _template constraints_, where the user gives the symbolic form of the
  constraint as well as a set of data to convert from the symbolic form to the
  numerical form.
* Instead of differentiating each constraint in its numerical form, we can
  compute one symbolic derivative of the constraint in symbolic form, and then
  plug in the data to get the numerical derivative of each function (see the
  sketch below).
* As a final step, if users don't provide the structure, we can still infer
  it, perhaps with less accuracy, by comparing the expression tree of each
  constraint.
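
The "differentiate once, evaluate many times" idea can be sketched in a few
lines of Julia. The names `dg`, `data`, and `x_value` below are hypothetical,
for illustration only:

```julia
# Template constraint: g(x) = sin(a * x), where `a` is problem data.
# Differentiate the template once by hand: dg/dx = a * cos(a * x).
dg(a, x) = a * cos(a * x)
# Plug the data into the single symbolic derivative, instead of
# symbolically differentiating each numerical constraint separately.
data = [1.0, 2.0, 3.0]         # one `a` per constraint
x_value = [0.5, 0.6, 0.7]      # current value of `x` in each constraint
jacobian = [dg(a, x) for (a, x) in zip(data, x_value)]
```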

The symbolic differentiation approach of Gravity works well when the problem is
large with few unique constraints. For example, a model like:
```julia
model = Model()
@variable(model, 0 <= x[1:10_000] <= 1)
@constraint(model, [i=1:10_000], sin(x[i]) <= 1)
@objective(model, Max, sum(x))
```
is ideal, because although the Jacobian matrix has 10,000 rows, we can compute
the derivative of `sin(x[i])` as `cos(x[i])` once, and then fill in the Jacobian
by evaluating the derivative function instead of having to differentiate 10,000
expressions.

The symbolic differentiation approach of Gravity works poorly if there are a
large number of unique constraints in the model (which would require a lot of
expressions to be symbolically differentiated), or if the nonlinear functions
contain a large number of nonlinear terms (which would make the symbolic
derivative expensive to compute).

`SymbolicAD` started life as [MathOptSymbolicAD.jl](https://github.com/lanl-ansi/MathOptSymbolicAD.jl),
development of which began in early 2022. This initial version of `SymbolicAD`
used the `Symbolics.jl` package to compute the symbolic derivatives. In 2025, we
rewrote MathOptSymbolicAD.jl to remove the dependence on `Symbolics.jl`, and,
since the rewrite depended only on MathOptInterface, we added it to
`MOI.Nonlinear` as a new submodule.

For more details on `MathOptSymbolicAD.jl`, see Oscar's [JuMP-dev 2022 talk](https://www.youtube.com/watch?v=d_X3gj3Iz-k),
although note that the syntax has changed since the original recording.

## Use SymbolicAD with JuMP

To use `SymbolicAD` with JuMP, set the [`AutomaticDifferentiationBackend`](@ref)
attribute to [`Nonlinear.SymbolicMode`](@ref):

```julia
using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
set_attribute(
    model,
    MOI.AutomaticDifferentiationBackend(),
    MOI.Nonlinear.SymbolicMode(),
)
@variable(model, x[1:2])
@objective(model, Min, (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2)
optimize!(model)
```

To revert to the default sparse reverse mode algorithm, set the
[`AutomaticDifferentiationBackend`](@ref) attribute to
[`Nonlinear.SparseReverseMode`](@ref).
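
For example, reusing the `model` from above:

```julia
set_attribute(
    model,
    MOI.AutomaticDifferentiationBackend(),
    MOI.Nonlinear.SparseReverseMode(),
)
```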

## `simplify`

Use [`Nonlinear.SymbolicAD.simplify`](@ref) to simplify nonlinear expressions.
The simplification algorithm performs simple rewrites such as lifting nested
summations:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(
           :+,
           Any[MOI.ScalarNonlinearFunction(:+, Any[1.0, x]), 2.0 * x + 3.0],
       )
+(+(1.0, MOI.VariableIndex(1)), 3.0 + 2.0 MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
+(1.0, MOI.VariableIndex(1), 3.0 + 2.0 MOI.VariableIndex(1))
```

and trivial identities such as ``x^1 = x``:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:^, Any[x, 1])
^(MOI.VariableIndex(1), (1))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
MOI.VariableIndex(1)
```

The list of rewrites is intentionally limited to keep the codebase simple.
`Nonlinear.SymbolicAD` is not a substitute for a Computer Algebra System (CAS).
For example, we do not detect the relationship
``\sin(x)^2 + \cos(x)^2 = 1``:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> sin_x = MOI.ScalarNonlinearFunction(:sin, Any[x])
sin(MOI.VariableIndex(1))

julia> cos_x = MOI.ScalarNonlinearFunction(:cos, Any[x])
cos(MOI.VariableIndex(1))

julia> f = MOI.ScalarNonlinearFunction(
           :+,
           Any[
               MOI.ScalarNonlinearFunction(:^, Any[sin_x, 2]),
               MOI.ScalarNonlinearFunction(:^, Any[cos_x, 2]),
           ],
       )
+(^(sin(MOI.VariableIndex(1)), (2)), ^(cos(MOI.VariableIndex(1)), (2)))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
+(^(sin(MOI.VariableIndex(1)), (2)), ^(cos(MOI.VariableIndex(1)), (2)))
```

In addition to [`Nonlinear.SymbolicAD.simplify`](@ref), there is an in-place
version [`Nonlinear.SymbolicAD.simplify!`](@ref) that may make changes to the
existing function.
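
As a minimal sketch of the in-place version, reusing `f` from the doctest
above. Because the simplified expression may be of a different type, use the
value that `simplify!` returns rather than relying on mutation alone:

```julia
# `simplify!` may modify its argument in-place, but the simplified result
# (which is also returned) may not be a `ScalarNonlinearFunction`.
f = MOI.Nonlinear.SymbolicAD.simplify!(f)
```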

## `variables`

Use [`Nonlinear.SymbolicAD.variables`](@ref) to return a list of the variables
that appear in the function:

```jldoctest
julia> x = MOI.VariableIndex.(1:3)
3-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(1)
 MOI.VariableIndex(2)
 MOI.VariableIndex(3)

julia> f = MOI.ScalarNonlinearFunction(:atan, Any[x[3], 2.0 * x[1]])
atan(MOI.VariableIndex(3), 0.0 + 2.0 MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.variables(f)
2-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(3)
 MOI.VariableIndex(1)
```

Note that this list is not sorted, but it is guaranteed to contain only unique
items.

## `derivative`

Use [`Nonlinear.SymbolicAD.derivative`](@ref) to compute the symbolic derivative
of a function with respect to a decision variable:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[x])
sin(MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.derivative(f, x)
*(cos(MOI.VariableIndex(1)), (true))
```

Note that the resultant expression can often be simplified. Thus, in most cases
you should call [`Nonlinear.SymbolicAD.simplify`](@ref) on the expression before
using it in other places:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[x])
sin(MOI.VariableIndex(1))

julia> df_dx = MOI.Nonlinear.SymbolicAD.derivative(f, x)
*(cos(MOI.VariableIndex(1)), (true))

julia> MOI.Nonlinear.SymbolicAD.simplify!(df_dx)
cos(MOI.VariableIndex(1))
```

## `gradient_and_hessian`

In some cases, you may want to compute the gradient and (sparse) Hessian matrix
of a function. One way to achieve this is by recursively calling
[`Nonlinear.SymbolicAD.derivative`](@ref) on the result of
[`Nonlinear.SymbolicAD.derivative`](@ref) for each variable in the list of
[`Nonlinear.SymbolicAD.variables`](@ref).
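
For example, here is a minimal sketch of that manual approach, assuming `f` is
the function from the doctest below; note that it produces a dense matrix of
second-derivative expressions rather than a sparse one:

```julia
import MathOptInterface as MOI
SymbolicAD = MOI.Nonlinear.SymbolicAD
y = SymbolicAD.variables(f)
# Differentiate twice for every pair of variables, simplifying each result.
dense_hessian = [
    SymbolicAD.simplify(
        SymbolicAD.derivative(SymbolicAD.derivative(f, y_i), y_j),
    )
    for y_i in y, y_j in y
]
```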

But, to simplify the process, you should use
[`Nonlinear.SymbolicAD.gradient_and_hessian`](@ref):

```jldoctest
julia> x = MOI.VariableIndex.(1:2);

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[1.0 * x[1] + 2.0 * x[2]])
sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))

julia> y, ∇f, H, ∇²f = MOI.Nonlinear.SymbolicAD.gradient_and_hessian(f);

julia> y
2-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(1)
 MOI.VariableIndex(2)

julia> ∇f
2-element Vector{Any}:
 cos(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))
 *(cos(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2)), 2.0)

julia> H
3-element Vector{Tuple{Int64, Int64}}:
 (1, 1)
 (1, 2)
 (2, 2)

julia> ∇²f
3-element Vector{Any}:
 -(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2)))
 *(-(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))), 2.0)
 *(-(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))), 4.0)
```

where:

* `y` is the list of variables that appear in `f`
* `∇f` is the vector of first partial derivatives of `f` with respect to each
  variable in `y`
* `H` and `∇²f` form a sparse Hessian matrix, where `H` is the `(row, column)`
  index of each element, and the corresponding element in `∇²f` is the second
  partial derivative of `f` with respect to `y[row]` and `y[column]` (see the
  sketch below).
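
As a sketch of how to consume the sparse format, the following loop, continuing
from the doctest above, pairs each structural index with its expression. Note
that only one triangle of the symmetric Hessian is stored, as the example
output shows:

```julia
for (k, (row, col)) in enumerate(H)
    # ∇²f[k] is the expression for the (row, col) entry of the Hessian.
    println("H[", row, ", ", col, "] = ", ∇²f[k])
end
```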

Unlike [`Nonlinear.SymbolicAD.derivative`](@ref), the gradient and Hessian
expressions have already been simplified; you do not need to call
[`Nonlinear.SymbolicAD.simplify`](@ref).
