Commit 1ac24ff

Disable flaky test points
It is not yet clear what causes Ollama to be unreliable in these situations, but we do see differences between machines. We will need to analyse this separately and report it upstream.
1 parent 392749f commit 1ac24ff
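The pattern used in this commit relies on MATLAB's assumption qualifications: a failing assumption marks a test as Filtered/Incomplete rather than Failed, so a known-flaky test stops breaking CI but still shows up in test reports. A minimal sketch of the idea (class and method names here are hypothetical, not from this repository):

```matlab
classdef tFlakyExample < matlab.unittest.TestCase
    methods (Test)
        function flakyPoint(testCase)
            % A failing assumption filters the test (Incomplete),
            % unlike verifyTrue/assertTrue, which would fail it.
            testCase.assumeTrue(false, ...
                "disabled due to Ollama/llama.cpp not honoring parameter reliably");

            % Nothing below this line runs while the assumption fails,
            % so the flaky checks are skipped without deleting them.
        end
    end
end
```

Because the test body is left intact, re-enabling it later only requires removing the `assumeTrue` call.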

File tree

1 file changed: 12 additions (+), 2 deletions (−)


tests/tollamaChat.m

Lines changed: 12 additions & 2 deletions
@@ -50,8 +50,13 @@ function extremeTopK(testCase)
         end
 
         function extremeTfsZ(testCase)
-            % setting tfs_z to z=0 leaves no random choice,
-            % so we expect to get a fixed response.
+            %% This should work, and it does on some computers. On others, Ollama
+            %% receives the parameter, but either Ollama or llama.cpp fails to
+            %% honor it correctly.
+            testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");
+
+            % setting tfs_z to z=0 leaves no random choice, but degrades to
+            % greedy sampling, so we expect to get a fixed response.
             chat = ollamaChat("mistral",TailFreeSamplingZ=0);
             prompt = "Sampling with tfs_z=0 returns a definite answer.";
             response1 = generate(chat,prompt);
@@ -70,6 +75,11 @@ function stopSequences(testCase)
         end
 
         function seedFixesResult(testCase)
+            %% This should work, and it does on some computers. On others, Ollama
+            %% receives the parameter, but either Ollama or llama.cpp fails to
+            %% honor it correctly.
+            testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");
+
             chat = ollamaChat("mistral");
             response1 = generate(chat,"hi",Seed=1234);
             response2 = generate(chat,"hi",Seed=1234);
