Commit 19e5617

docs: explain language insights and config mechanisms (#280)
This PR updates the documentation as a follow-up to https://github.com/sourcegraph/sourcegraph/pull/62011 where we introduced new environment variables and improved the performance of language stats insights.
1 parent ef3886a commit 19e5617

5 files changed

+67
-20
lines changed

docs/code_insights/explanations/administration_and_security_of_code_insights.mdx

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -59,3 +59,39 @@ The following setting(s) apply to adding new data to a previously backfilled Cod
5959

6060
The following setting(s) apply to both backfilling data and adding new data
6161
- `insights.query.worker.rateLimit` - Maximum number of Code Insights queries initiated per second on a worker node.
62+
63+
## Language Stats Performance Configuration (>= May 2024 Release)
64+
65+
To create language stats for a repository, the Sourcegraph instance analyzes all files from the target repository. Depending
66+
on the repository's size, this can take from a few seconds to multiple minutes. If you need to analyze a repository that's
67+
larger than 10GB, feel free to reach out to Sourcegraph support.
68+
69+
If a query completes, the result is stored in an API-level cache and subsequent queries complete within one second. If the query
70+
does not complete in time, an internal cache stays partially populated. This cache is reused on subsequent queries, which
71+
then have less work to do and may resolve faster.
72+
73+
With the May 2024 Release, we increased the default number of concurrent requests to the gitserver from 1 to 4, and raised
74+
the timeout for each language stats query from 3 minutes to 5 minutes.
75+
76+
### Concurrent Requests
77+
78+
The number of concurrent requests to the gitserver is configurable through the `GET_INVENTORY_GIT_SERVER_CONCURRENCY` environment variable ([#62011](https://github.com/sourcegraph/sourcegraph/pull/62011)).
79+
We recommend increasing this carefully, as an increase in concurrency may cause the gitserver to become overloaded and slow down responses.
80+
81+
Example:
82+
83+
```
84+
GET_INVENTORY_GIT_SERVER_CONCURRENCY=4
85+
```
86+
87+
To understand how this configuration impacts your language stats queries, you can use [tracing](/admin/observability/tracing).
88+
89+
### Language Stats Timeout
90+
91+
The timeout in minutes for language stats queries is configurable through the `GET_INVENTORY_TIMEOUT` environment variable ([#62011](https://github.com/sourcegraph/sourcegraph/pull/62011)).
92+
93+
Example:
94+
95+
```
96+
GET_INVENTORY_TIMEOUT=5
97+
```
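
As a rough illustration of how the two settings above interact, the following Python sketch reads both environment variables and applies them as a bounded worker pool plus an overall deadline. It is not Sourcegraph's implementation; the helpers `read_int_env`, `analyze_file`, and `run_inventory` are hypothetical names introduced only for this example, and the fallback values of 4 and 5 mirror the defaults stated above.

```python
import os
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, wait


def read_int_env(name: str, default: int) -> int:
    """Return a positive integer from the environment, or the default."""
    raw = os.environ.get(name, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def analyze_file(path: str) -> str:
    """Hypothetical stand-in for one request to the gitserver."""
    time.sleep(0.01)
    return path.rsplit(".", 1)[-1]  # treat the file extension as the language


def run_inventory(paths):
    """Analyze paths with bounded concurrency and an overall timeout."""
    concurrency = read_int_env("GET_INVENTORY_GIT_SERVER_CONCURRENCY", 4)
    timeout_minutes = read_int_env("GET_INVENTORY_TIMEOUT", 5)

    pool = ThreadPoolExecutor(max_workers=concurrency)
    futures = [pool.submit(analyze_file, p) for p in paths]
    # Wait until every request finishes or the timeout expires.
    done, not_done = wait(futures, timeout=timeout_minutes * 60)
    # Drop requests that have not started yet (requires Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)

    counts = Counter(f.result() for f in done)
    return counts, len(not_done)
```

Raising the concurrency value lets more requests run at once, which is why the text above recommends increasing it carefully: every extra worker is additional simultaneous load on the gitserver.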

docs/code_insights/explanations/current_limitations_of_code_insights.mdx

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,10 +24,21 @@ To accurately return historical data for insights running over many repositories
2424

2525
The number of insights you have does not affect the overall speed at which they run: it will take the same total time to run all of them whether or not you let each one finish before creating the next one. As of version 4.4.0 Insights prioritize completing the backfills for the insights that will complete the fastest. In general this means that insights over many repositories will pause to allow insights over a few repositories to complete.
2626

27+
## Creating language insights for a very large repository
28+
29+
> NOTE: This applies to Sourcegraph versions greater than `5.3`
30+
31+
Similar to [insights in general](#creating-insights-over-very-large-repositories), creating a language insight over a very large repository can be slow.
32+
33+
Language insights become faster as the internal cache populates, but depending on your Sourcegraph instance and repository size this may take a few attempts.
34+
35+
By default, the dashboard makes up to three query attempts, each of which can take up to 5 minutes. It retries automatically until the three attempts are exhausted.
36+
37+
Apart from waiting and retrying, you may also reach out to your Sourcegraph administrator to [increase the number of concurrent queries or increase the timeout for the query](/code_insights/explanations/administration_and_security_of_code_insights).
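
The effect of the partially populated cache on retries can be sketched with a small Python model. This is an assumption-laden illustration, not the actual implementation: the timeout is modelled as a fixed work budget per attempt rather than wall-clock time, and `language_stats` and `with_retries` are hypothetical names.

```python
def language_stats(files, cache, budget):
    """Try to analyze every file; return totals, or None if the budget runs out.

    files  maps path -> (language, size in bytes)
    cache  persists between attempts and keeps work that was already done
    budget is the number of uncached files one attempt may analyze
    """
    remaining = budget
    for path, info in files.items():
        if path in cache:
            continue  # reused from an earlier attempt, costs nothing
        if remaining == 0:
            return None  # "timed out"; the cache stays partially populated
        cache[path] = info
        remaining -= 1

    totals = {}
    for language, size in cache.values():
        totals[language] = totals.get(language, 0) + size
    return totals


def with_retries(files, budget, attempts=3):
    """Retry up to `attempts` times, sharing one cache across attempts."""
    cache = {}
    for attempt in range(1, attempts + 1):
        result = language_stats(files, cache, budget)
        if result is not None:
            return result, attempt
    return None, attempts
```

In this model a repository with five files and a budget of two files per attempt fails twice and succeeds on the third attempt, because each attempt only has to analyze what the previous ones left uncached.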
2738

2839
## Creating insights over very large repositories
2940

30-
> NOTE: The feature applies on Sourcegraph version graeter than `3.42`
41+
> NOTE: The feature applies on Sourcegraph version greater than `3.42`
3142
3243
In some cases, depending on the size of the Sourcegraph instance and the size of the repo, you may see odd behavior or timeout errors if you try to create a code insight running over a single large repository. In this case, it's best to try:
3344

docs/code_insights/how-tos/Troubleshooting.mdx

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -23,16 +23,16 @@ scale may be responsible.
2323
3. (admin-only) Check the queries currently in background processing using the GraphQL query
2424

2525
``` gql
26-
query seriesStatus {
26+
query seriesStatus {
2727
insightSeriesQueryStatus {
28-
seriesId
29-
query
30-
enabled
31-
completed
32-
errored
33-
processing
34-
failed
35-
queued
28+
seriesId
29+
query
30+
enabled
31+
completed
32+
errored
33+
processing
34+
failed
35+
queued
3636
}
3737
}
3838
```
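
For administrators who prefer to run this query from a script, the following Python sketch builds the HTTP request and filters a response. It rests on assumptions to verify against your instance: that the GraphQL API is served at `/.api/graphql`, that it accepts an `Authorization: token ...` header, and that `errored` and `failed` are returned as counts. The helpers `build_request` and `stuck_series` are hypothetical names for this example.

```python
import json
import urllib.request

SERIES_STATUS_QUERY = """
query seriesStatus {
  insightSeriesQueryStatus {
    seriesId
    query
    enabled
    completed
    errored
    processing
    failed
    queued
  }
}
"""


def build_request(instance_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the query."""
    body = json.dumps({"query": SERIES_STATUS_QUERY}).encode("utf-8")
    return urllib.request.Request(
        instance_url.rstrip("/") + "/.api/graphql",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": "token " + token,  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stuck_series(response: dict) -> list:
    """Return the IDs of series that report errored or failed queries."""
    rows = response["data"]["insightSeriesQueryStatus"]
    return [row["seriesId"] for row in rows if row["errored"] or row["failed"]]
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and decoding the JSON body before handing it to `stuck_series`.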

docs/code_insights/language_insight_quickstart.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ If you are more interested tracking the historical or future result count of an
2424

2525
### 3. Once on the "Set up new language usage insight" form fields page, enter the repository you want to analyze.
2626

27-
Enter repositories in the repository URL format, like `github.com/Sourcegraph/Sourcegraph`.
27+
Enter repositories in the repository URL format, like `github.com/sourcegraph/docs`. We recommend a small repository so that you get a quick result.
2828

2929
The form field will validate that you've entered the repository correctly.
3030

docs/versioned/5.2/code_insights/how-tos/Troubleshooting.mdx

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -23,16 +23,16 @@ scale may be responsible.
2323
3. (admin-only) Check the queries currently in background processing using the GraphQL query
2424

2525
``` gql
26-
query seriesStatus {
26+
query seriesStatus {
2727
insightSeriesQueryStatus {
28-
seriesId
29-
query
30-
enabled
31-
completed
32-
errored
33-
processing
34-
failed
35-
queued
28+
seriesId
29+
query
30+
enabled
31+
completed
32+
errored
33+
processing
34+
failed
35+
queued
3636
}
3737
}
3838
```
