feat: add ruff and resolve issue #2262
Conversation
Force-pushed from d12f426 to 790082b
I'm fine moving to ruff. But seeing integration test failures means something very wrong has happened.
Force-pushed from 05be40f to 634e958
Force-pushed from 25452f6 to e216e53
Thanks a lot.
There are a few places where I'm confused by the changes ruff asked for.
We could merge as-is if necessary, mostly nits.
def is_fbgemm_gpu_available():
    try:
        return importlib.util.find_spec("fbgemm_gpu.experimental.gen_ai") is not None
What's the rationale behind this change?
I find it much less clear in intent. If ruff complains there must be a good reason behind it.
This is a Pyflakes lint for unused-import (F401):
https://docs.astral.sh/ruff/rules/unused-import/
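For context, a minimal sketch of the two patterns; the try/except import version is an assumption about what the pre-ruff helper looked like, not a quote from the diff:

import importlib.util

# Pattern Pyflakes flags: the module is imported only to test whether the
# import succeeds, so the bound name is never used (F401, unused-import).
def is_fbgemm_gpu_available_old():
    try:
        import fbgemm_gpu.experimental.gen_ai  # would need a noqa to silence F401
        return True
    except ImportError:
        return False

# Pattern that satisfies the linter: probe for the module without binding a name.
# The try/except is still useful because find_spec imports parent packages and
# can raise ModuleNotFoundError when fbgemm_gpu itself is absent.
def is_fbgemm_gpu_available_new():
    try:
        return importlib.util.find_spec("fbgemm_gpu.experimental.gen_ai") is not None
    except ModuleNotFoundError:
        return False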
@@ -0,0 +1,56 @@
import torch
Where is this file coming from?
This resolves a Pyflakes undefined-name issue for the torch_snr_error function used in server/text_generation_server/layers/gptq/quantize.py. This file adds torch_snr_error as copied from openppl-public/ppq.
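For reference, a minimal sketch of what an SNR-error helper along these lines computes; this is a simplified rewrite, not the exact code copied from ppq, which handles more reduction modes and shape checks:

import torch

def torch_snr_error(y_pred: torch.Tensor, y_real: torch.Tensor) -> torch.Tensor:
    # Compare per-sample noise power against signal power; a lower value means
    # the (e.g. quantized) prediction is closer to the reference tensor.
    y_pred = y_pred.reshape(y_pred.shape[0], -1)
    y_real = y_real.reshape(y_real.shape[0], -1)
    noise_power = (y_pred - y_real).pow(2).sum(dim=-1)
    signal_power = y_real.pow(2).sum(dim=-1)
    return (noise_power / signal_power).mean()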
@@ -2,12 +2,13 @@
import math
import torch
from torch import nn
from loguru import logger

# Inverse dim formula to find dim based on number of rotations
??
ooo stray comment. removed, thanks!
@@ -69,12 +69,12 @@ def load(config, prefix: str, weights):

    # GPTQ,AWQ,EETQ don't quantize heads (nor embeddings)
    if config.quantize in ["gptq", "awq", "eetq", "marlin"]:
-        quantize = None
+        pass
This feels wrong. We really want to set quantize=None in each of these I think.
This is a lint for unused-variable, since quantize is set but never used in this function.
I reverted the change, but I'm not sure whether the check is needed.
@danieldk is there a case where we need the quantize variable?
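For illustration, a minimal, hypothetical sketch of what the unused-variable rule (F841) reacts to here; the names below are stand-ins, not the actual TGI code:

def load_head(quantize):
    # GPTQ/AWQ/EETQ/Marlin don't quantize heads (nor embeddings)
    if quantize in ["gptq", "awq", "eetq", "marlin"]:
        quantize = None
    # Because `quantize` is read again below, F841 does not fire in this sketch.
    # In the original function nothing after the assignment read `quantize`,
    # which is why ruff flagged it (and why replacing it with `pass` was a lossy fix).
    print(f"building head with quantize={quantize}")

load_head("gptq")  # -> building head with quantize=None
load_head("bnb")   # -> building head with quantize=bnb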
@@ -523,7 +528,7 @@ def get_model(
        dtype=dtype,
        trust_remote_code=trust_remote_code,
    )
-    elif model_type == MAMBA:
+    elif model_type == ModelType.MAMBA:
I'd rather keep the global hack personally.
This is just unnecessary indirection to me. The pre-existing code is also indirection compared to pure string comparison, but the reason was to make sure we keep documenting supported models correctly.
This is related to the undefined-name lint, since ruff could not determine that the variables exist.
I agree this adds unnecessary complexity; I reverted to the global variables and added # ruff: noqa: F821 at the top of the file so that this specific lint is skipped there.
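For context, a minimal sketch of the pattern being kept; the exact injection mechanism below is an assumption about the existing "global hack", not a quote of the file:

# ruff: noqa: F821
# The model-type constants referenced in get_model (e.g. MAMBA) are created
# dynamically below, so ruff's undefined-name rule cannot see them statically.
from enum import Enum

class ModelType(Enum):
    MAMBA = {"type": "ssm", "name": "Mamba"}

# Promote every enum member to a module-level constant (MAMBA = "ssm", ...),
# so dispatch can stay a plain string comparison while the enum keeps the
# documentation of supported models in one place.
for model in ModelType:
    globals()[model.name] = model.value["type"]

def get_model(model_type: str) -> str:
    if model_type == MAMBA:  # would be F821 without the file-level noqa above
        return "mamba"
    return "other"

print(get_model("ssm"))  # -> mamba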
@@ -233,7 +233,7 @@ def filter(self, request_ids: List[int]) -> Optional["CausalLMBatch"]:
    ]

    # Ensure that past_key_values tensors can be updated in-place
-    if type(self.past_key_values[0]) == tuple:
+    if isinstance(self.past_key_values[0], tuple):
No.
type(xxx) == tuple was definitely intended here, it's not a rookie mistake.
It's because xxx can be an instance of tuple, yet not a tuple.
I don't remember specifically the issue at hand, but I'd rather trust our past selves than the linter here.
Ahhh, great catch!!
It seems that a variable can be an instance of tuple via inheritance, which would return True even when it's not a base tuple. Example repro below:

class CustomTuple(tuple):
    pass

ct = CustomTuple([1, 2, 3])
print(type(ct) == tuple)      # False
print(isinstance(ct, tuple))  # True

I reverted the isinstance changes and opted to replace == with is to make the linter happy. This should be equivalent: is checks for object identity where == checks for value, and in the case of built-in types these are the same.
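A quick check (assuming no metaclass overrides __eq__) that the is form keeps the original behaviour for the subclass case:

class CustomTuple(tuple):
    pass

ct = CustomTuple([1, 2, 3])
plain = (1, 2, 3)
print(type(ct) is tuple)     # False -- subclass instances are still rejected
print(type(plain) is tuple)  # True
print(type(plain) == tuple)  # True -- same result; `is` just avoids the E721 lint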
@@ -39,6 +39,12 @@
from transformers.activations import ACT2FN
from transformers.configuration_utils import PretrainedConfig

if SYSTEM == "rocm":
    try:
        from vllm import _custom_C
Where is this coming from?
This doesn't seem to be linked to ruff.
This is related to an undefined-name lint: _custom_C is used in the DeepseekV2MLP.forward call but was not imported/defined.
@@ -45,6 +45,15 @@
    SpeculativeHead,
)

# copied from https://github.com/huggingface/transformers/blob/cd4584e3c809bb9e1392ccd3fe38b40daba5519a/src/transformers/models/t5/modeling_t5.py#L1316
# Warning message for FutureWarning: head_mask was separated into two input args - head_mask, decoder_head_mask
__HEAD_MASK_WARNING_MSG = """
???? Why do we need this?
This is also an undefined-name lint: __HEAD_MASK_WARNING_MSG is used in a warning in T5ForConditionalGeneration.forward. However, I'm not sure we need this warning... leaving it for now, but happy to remove it.
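For reference, roughly how that constant is used in the upstream transformers T5 code; the message text and condition below are paraphrased from the linked modeling_t5.py, so treat the details as approximate:

import warnings

__HEAD_MASK_WARNING_MSG = """
The input argument `head_mask` was split into two arguments `head_mask` and
`decoder_head_mask`. Currently, `decoder_head_mask` is set to copy `head_mask`,
but this feature is deprecated and will be removed in future versions.
"""

def forward(head_mask=None, decoder_head_mask=None):
    # When only head_mask is given, fall back to reusing it for the decoder and
    # emit the FutureWarning above -- this reference is what triggered the
    # undefined-name lint while the constant was missing from the file.
    if head_mask is not None and decoder_head_mask is None:
        warnings.warn(__HEAD_MASK_WARNING_MSG, FutureWarning)
        decoder_head_mask = head_mask
    return decoder_head_mask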
This PR adds ruff linting to TGI and adds ruff to the pre-commit hook