Skip to content
This repository was archived by the owner on Sep 11, 2020. It is now read-only.

format/diff: unified diff encoder and public API #388

Merged
merged 2 commits into from
May 23, 2017
Merged

format/diff: unified diff encoder and public API #388

merged 2 commits into from
May 23, 2017

Conversation

ajnavarro
Copy link
Contributor

@ajnavarro ajnavarro commented May 10, 2017

  • Added Patch interface
  • Added a Unified Diff encoder from Patches
  • Added Change method to generate Patches
  • Added Changes method to generate Patches
  • Added Tree method to generate Patches
  • Added Commit method to generate Patches

It resolves #57

chunkInit = "@@ -%s +%s @@%s\n"
chunkCount = "%d,%d"

noFilePath = "/dev/null"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the way that unified diff uses to represent "no file". Is OS independent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in git it's hardcoded without any conditional.

@@ -42,6 +43,17 @@ func (f *File) Contents() (content string, err error) {
return buf.String(), nil
}

// IsBinary returns if the file is binary or not
func (f *File) IsBinary() (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lacks of test on this package

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@codecov
Copy link

codecov bot commented May 11, 2017

Codecov Report

Merging #388 into master will decrease coverage by 0.1%.
The diff coverage is 90.87%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #388      +/-   ##
==========================================
- Coverage   77.27%   77.17%   -0.11%     
==========================================
  Files         123      125       +2     
  Lines        8810     9085     +275     
==========================================
+ Hits         6808     7011     +203     
- Misses       1234     1311      +77     
+ Partials      768      763       -5
Impacted Files Coverage Δ
plumbing/object/change.go 100% <100%> (ø) ⬆️
plumbing/object/commit.go 69.91% <50%> (-1.39%) ⬇️
plumbing/object/tree.go 76.53% <60%> (-0.44%) ⬇️
plumbing/object/file.go 75% <66.66%> (-1.09%) ⬇️
plumbing/object/patch.go 83.72% <83.72%> (ø)
utils/binary/read.go 68.42% <86.66%> (+6.51%) ⬆️
plumbing/format/diff/unified_encoder.go 98.25% <98.25%> (ø)
plumbing/transport/ssh/common.go 0% <0%> (-50.91%) ⬇️
plumbing/transport/ssh/auth_method.go 32.4% <0%> (-24.08%) ⬇️
repository.go 72.87% <0%> (-0.02%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ff77a8...65416cf. Read the comment docs.

@ajnavarro ajnavarro changed the title [WIP] plumbing/format/diff: unified diff encoder plumbing/format/diff: unified diff encoder May 16, 2017
@ajnavarro ajnavarro requested review from smola, mcuadros and alcortesm May 16, 2017 08:22
var ErrBothFilesEmpty = errors.New("both files are empty")

// UnifiedEncoder encodes an unified diff into the provided Writer. There are some unsupported features:
// - Similarity index is not supported
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Similarity index/Similarity index for renames/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add is not supported if you just said this is a list of unsupported features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

for _, p := range patch.FilePatches() {
f, t := p.Files()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encodeFilePatch function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func (e *UnifiedEncoder) printMessage(message string) error {
isEmpty := message == ""
hasSuffix := strings.HasSuffix(message, "\n")
msg := message
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to declare this variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


func (c *hunksGenerator) Generate() []*hunk {
for i, chunk := range c.chunks {
ls := splitLines(chunk.Content())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate function?

c.beforeContext = nil
}

// Logic used to obtain line numbers in a new chunk
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested wording: addLineNumbers obtains the line numbers in a new chunk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil {
return nil, err
}
toTree, err := to.Tree()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func getPatch(message string, changes ...*Change) (fdiff.Patcheable, error) {
var filePatches []fdiff.FilePatch
for _, c := range changes {
from, to, err := c.Files()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


const sniffLen = 8000

//IsBinary detects if data is a binary value based on:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces after //

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@alcortesm alcortesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the package can be improved by writing a package level documentation explaining the package itself, its purpose, design and style. Some examples will be good too.

The tests are quite awesome.

I don't know much about templates, but it looks as if the unified encoder can be done using templates. Not sure if it is a good idea though.

The explanation of the encoder should come along with a pointer to the documentation of the format.

"gopkg.in/src-d/go-git.v4/plumbing/filemode"
)

type Operation int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document this type and its values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// Patch represents the necessary steps to transform several files to another.
type Patch interface {
FilePatches() []FilePatch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document this methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// Patcheable is a Patch but with several ways to encode it. It is used to be
// able to use Patch in a fluent way.
type Patcheable interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this interface or its comments.

I think the name, Patcheable, is not transmitting the full meaning of the interface, as implementing this interface does not mean something can be Pathed.

Also, the comment states the are several ways to encode it, but there is only one method and it is not called encode or mention encoding anywhere in its documentation.

Maybe if you explain what you want to achieve here and how, it will be easier to understand. And example can also help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote the documentation, maybe is clear now. If not, can you suggest a better name for this interface?

// UnifiedDiff writes the patch into the provided writer in unified
// diff format. ctxLines changes the amount of unchanged lines that will be
// printed before and after each chunk.
UnifiedDiff(ctxLines int, w io.Writer) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the writer the first argument, a l`a golang way.

I don't understand this, shouldn't this be an encoder of Path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the documentation, I hope is clear now.
✅ to change arguments order.


// FilePatch represents the necessary steps to transform one file to another
type FilePatch interface {
IsBinary() bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is binary here, to or from or the FilePath itself, if it is the FilePath itself, then explain that there are two types of FilePatches and why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain what will from and to be when inserting or deleting a file (Hash?, filemode? path? or it will be just nil?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

func (e *UnifiedEncoder) prints(s string) error {
_, err := e.Write([]byte(s))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this all together, it is a print method that does not print.

This should be a call to bytes.Buffer.WriteTo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
}

func (c *hunksGenerator) Generate() []*hunk {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the struct is private, to know which methods are intended to be used outside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is usually achieved in Go using an internal package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't need to be an internal package, this is also a good practice at least for me

return buf.String()
}

func (c *hunk) AddOp(t Operation, s ...string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the struct is private, to know which methods are intended to be used outside.

@@ -78,6 +79,12 @@ func (c *Change) String() string {
return fmt.Sprintf("<Action: %s, Path: %s>", action, c.name())
}

// Patch returns a Patch with all the file changes in chunks. This
// representation can be used to create several diff outputs.
func (c *Change) Patch() (diff.Patcheable, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it make sense for Change to implement Patch or FilePatch instead of having a method that returns a Path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be, but appears better to me have the functionality separated. What do you think @smola @mcuadros ?

filePatches []fdiff.FilePatch
}

func (t textPatch) FilePatches() []fdiff.FilePatch {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document all the methods in these files. A simple ... implements <interface>. will suffice most of the times.

Copy link
Contributor Author

@ajnavarro ajnavarro May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete
)

// Patch represents the necessary steps to transform several files to another.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

English can be improved a little bit by simplifying the statement:

Patch represents a collection of steps to transform several files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🆗

// Patcheable contains ways to represent a patch. Its use is intended to create
// fluents APIs, and be able to discover more easily ways to work with Patches.
// You don't need to know that a NewUnifiedEncoder constructor exists, you just
// need to see the available methods after get a Patch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I still don't understand this interface. It looks as a Patch encoder but the name and the description suggest otherwise. Maybe it is just me, let's talk about this. Also the name of the function is specific to a format, but that format is not part of the name of the interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of this interface is do this:

p, err := commit.Patch(otherCommit)
handleErr(err)
err = p.UnifiedDiff(buf,2)
handleErr(err)

Instead of:

p, err := commit.Patch(otherCommit)
handleErr(err)
// To do this, you need to know the internals or find it into the documentation
ue := diff.NewUnifiedEncoder(buf, 1)
err = ue.Encode(p)
handleErr(err)


// Chunk is a operation to process to transform a file to another.
// Operations can be Equal, Add or Delete
type Chunk interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


func (e *UnifiedEncoder) printMessage(message string) {
isEmpty := message == ""
hasSuffix := strings.HasSuffix(message, "\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think endsInNewLine is a better name for this variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the message is a specific part of the patch format, is the very beginning of the result file. This method is not intended to write other kind of lines, just this header message if any.

}

func (c *hunk) WriteTo(buf *bytes.Buffer) {
fromString := fmt.Sprintf(chunkCount, c.fromLine, c.fromCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method need some improvements:

  • don't use strings, write to buf directly using fmt.Fprintf
  • don't generate the string / write to buf twice depending on c.fromCount or c.toCount, use an if ... else clause to generate the content only once.

toString = fmt.Sprintf("%d", c.toLine)
}

buf.WriteString(fmt.Sprintf(chunkInit, fromString, toString, c.ctxPrefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very wrong, just write to the buf with out generating the string.

}
}

func (c *hunksGenerator) Generate() []*hunk {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is usually achieved in Go using an internal package.

Copy link
Contributor

@alcortesm alcortesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version is super good. It still have some minor issues and I still don't understand the Patcheable interface, maybe we should talk about it.

@mcuadros
Copy link
Contributor

ajnavarro added 2 commits May 23, 2017 11:05
- Added Patch interface
- Added a Unified Diff encoder from Patches
- Added Change method to generate Patches
- Added Changes method to generate Patches
- Added Tree method to generate Patches
- Added Commit method to generate Patches
@ajnavarro ajnavarro changed the title plumbing/format/diff: unified diff encoder format/diff: unified diff encoder and public API May 23, 2017
@ajnavarro
Copy link
Contributor Author

@mcuadros rebased and unnecessary commits squashed. Also added diff line into compatibility file.

@mcuadros mcuadros merged commit f663a93 into src-d:master May 23, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Diff between commits
4 participants