Skip to content

[installer] make sure dashboard is deployed after server and papi-server #19042

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Nov 14, 2023

Conversation

mustard-mh
Copy link
Contributor

@mustard-mh mustard-mh commented Nov 9, 2023

Description

Summary generated by Copilot

🤖 Generated by Copilot at 8857387

This pull request adds a feature flag mechanism to the service-waiter component using the ConfigCat SDK, and uses it to control the waiting for the public-api-server and the server components in the dashboard deployment. It also adds the component name as a custom attribute for experiments. The changes affect the components/common-go/experiments, components/service-waiter/cmd, and install/installer/pkg/components/dashboard packages.

Related Issue(s)

Fixes EXP-822

How to test

  • Open PR with Gitpod
  • Change anything in server i.e. add a console.log
  • Exec leeway dev:preview pods should deployed successfully
  • Change anything in public-api-server i.e. fmt.Println
  • Exec leeway dev:preview pods should deployed successfully

Control with Feature Flag

  • Change Feature Flag service_waiter_skip_components value to skip true
  • It should skip all components' service-waiter
  • With attribute component set with server or public-api-server or server,public-api-server, it should skip target service waiter

Documentation

Preview status

Gitpod was successfully deployed to your preview environment.

Build Options

Build
  • /werft with-werft
    Run the build with werft instead of GHA
  • leeway-no-cache
  • /werft no-test
    Run Leeway with --dont-test
Publish
  • /werft publish-to-npm
  • /werft publish-to-jb-marketplace
Installer
  • analytics=segment
  • with-dedicated-emulation
  • workspace-feature-flags
    Add desired feature flags to the end of the line above, space separated
Preview Environment / Integration Tests
  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-large-vm
  • /werft with-gce-vm
    If enabled this will create the environment on GCE infra
  • with-integration-tests=all
    Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh. If enabled, with-preview and with-large-vm will be enabled.
  • with-monitoring

/hold

@roboquat roboquat added size/L and removed size/M labels Nov 9, 2023
@mustard-mh mustard-mh marked this pull request as ready for review November 9, 2023 10:10
@@ -46,6 +48,11 @@ var componentCmd = &cobra.Command{
ctx, cancel := context.WithTimeout(cmd.Context(), timeout)
defer cancel()

if shouldSkipComponentWaiter(ctx) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not we call it each time inside the waitPodsImage loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They behavior the same

@roboquat roboquat added size/XL and removed size/L labels Nov 10, 2023
func startWaitFeatureFlag(ctx context.Context, timeout time.Duration) {
featureFlagCtx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
client := experiments.NewClient(experiments.WithDefaultClient(nil))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need default client to nil, since it will never be nil without it -> it will return a client NewAlwaysReturningDefaultValueClient. Then self-hosted Gitpod Instance may need to keep wait for it

Copy link
Member

@akosyakov akosyakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed that to land it the metric should be added. We deploy it with false state, then we enable it on dogfood, then on Cloud, and afterwards for other cells.

pre-approved since we talked about it, but i did not try and metric should be added

/hold

@mustard-mh
Copy link
Contributor Author

mustard-mh commented Nov 13, 2023

@akosyakov Force push changes into 198d71d

Could you take a look? It may happen in preview env that pods ready too fast so there's no related metrics for them

@mustard-mh
Copy link
Contributor Author

mustard-mh commented Nov 13, 2023

Checked with preview env, that metric works. It took me a lot of time to debug with service-waiter. Configcat config fetch seems will failed for the start.

🥲 And I just recognized that we forget the very first idea to wait: if not skip, we need to keep fetch feature flag. In case its value is changed.

Addressed with 198d71d , metric test result see below

$ kubectl exec -it <ide_metrics_pod_name> sh
/ # apk add curl
/ # curl http://localhost:9500/metrics | grep service_waiter
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3690    0  3690    0     0  3673k      0 --:--:-- --:--:-- --:--:-- 3603k
# HELP service_waiter_skip_components_result_total Total number of wait result of service_waiter/component service_waiter_skip_components flag
# TYPE service_waiter_skip_components_result_total counter
service_waiter_skip_components_result_total{ok="true",value="false"} 3
/ # 

@mustard-mh
Copy link
Contributor Author

mustard-mh commented Nov 13, 2023

Tested again with deploy server / public-api with leeway run dev:preview we are good to go if metric 198d71d is alright @akosyakov

@akosyakov
Copy link
Member

@geropl Could you help reviewing it?

@akosyakov
Copy link
Member

Could you take a look? It may happen in preview env that pods ready too fast so there's no related metrics for them

Do you mean first deployment or each time?

@geropl
Copy link
Member

geropl commented Nov 14, 2023

Will test now, too!

@mustard-mh
Copy link
Contributor Author

Could you take a look? It may happen in preview env that pods ready too fast so there's no related metrics for them

Do you mean first deployment or each time?

@akosyakov It'll not affect component deployment, but we don't have metric in this case, since feature flag's value is not got yet.

return defaultValue, false, fetchTimes
default:
stringValue := client.GetStringValue(ctx, experiments.ServiceWaiterSkipComponentsFlag, "NONE", experiments.Attributes{
GitpodHost: componentCmdOpt.gitpodHost,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧡

fetchTimes int
}{
{
name: "happy path",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧡

getShouldSkipComponentWaiter := func() {
startTime := time.Now()
value, isActualValue, fetchTimes := ActualWaitFeatureFlag(featureFlagCtx, client, defaultSkip)
avgTime := time.Since(startTime) / time.Duration(fetchTimes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mustard-mh When waiting for longer than timeout, I got a integer divide by zero here (I scaled server down to 0 replicas to test):

panic: runtime error: integer divide by zero

goroutine 18 [running]:
github.com/gitpod-io/gitpod/service-waiter/cmd.startWaitFeatureFlag.func1()
        github.com/gitpod-io/gitpod/service-waiter/cmd/component.go:168 +0x4e5
github.com/gitpod-io/gitpod/service-waiter/cmd.startWaitFeatureFlag({0x1a40810?, 0xc000434c40?}, 0x0?)
        github.com/gitpod-io/gitpod/service-waiter/cmd/component.go:174 +0x1be
created by github.com/gitpod-io/gitpod/service-waiter/cmd.glob..func2 in goroutine 1
        github.com/gitpod-io/gitpod/service-waiter/cmd/component.go:56 +0x45b

@geropl
Copy link
Member

geropl commented Nov 14, 2023

Finished testing:

  • happy path: ✔️
  • skip with feature flag: ✔️
  • skip after timeout: got an error (see comment)

I double checked the code and tests, looks really good! 🙏 Once the last thing is fixed we should 🏁 and merge. 🚀

@mustard-mh
Copy link
Contributor Author

Addressed, thank you 🚢

@mustard-mh
Copy link
Contributor Author

/unhold

@roboquat roboquat merged commit a5021da into main Nov 14, 2023
@roboquat roboquat deleted the hw/EXP-822 branch November 14, 2023 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants