Adds Docker Image v1 Spec Documention #9560
From issue #9538:
Well, here it is 🐋 @SvenDowideit @fredlf Please review and give feedback.
3997360 to 82f5917
😍
also @vbatts @dmp42 @crosbymichael @tianon @jfrazelle @unclejack @docker/distribution-trust @nathanleclaire @cpuguy83 @huslage and anyone else in the community that comes across this: please read through it and comment on anything that isn't clear or that requires more explanation, keeping in mind that this is not a new specification but only documentation of how images are currently created and formatted in Docker. I was thinking we could also generate a list of 'issues' with this specification to include at the bottom, something that could help us drive the design of the next major version of the specification. Here are a few things I can think of for example:
also @metalivedev ;-)
The execution parameters which should be used as a base when running a container using the image.
<h4>Container RunConfig Field Descriptions</h4>
This container config has been a strong point of confusion for me and several others. As far as I can tell:
- This provides default values for settings if not specified at run time (e.g. `CpuShares`)
- Some of these settings are completely ignored (e.g. `Tty`, `Attach*`, ...)
- As far as I understand, this whole idea of "default container config" is out of the v2 image format, so although this is purely v1 documentation, don't you think it might be relevant to add a "deprecation warning"?
good points @icecrime
I mentioned above:
- is every field of the `runconfig.Config` struct necessary or useful?
I can update this document to clarify this - only I'm not entirely sure which fields are ignored. I guess I can dig up the code to find out exactly what's going on: https://github.com/docker/docker/blob/58ce0146e16e2e63b7a94d34a48722a9c7400c18/daemon/daemon.go#L418
Do you happen to know which fields are used? @erikh I think you have some expertise with runconfig, could you shed any light on this?
I think it's all in `runconfig.Merge`. Fields used:
- `Cmd`
- `CpuShares`
- `Entrypoint`
- `Env`
- `ExposedPorts` (and its legacy counterpart `PortSpecs`)
- `Memory`
- `MemorySwap`
- `User`
- `Volumes`
- `WorkingDir`
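To make the "defaults if not specified at run time" idea concrete, here is a rough sketch. `merge_config` and the plain-dict layout are hypothetical illustrations, not Docker's actual `runconfig.Merge`; in particular it assumes an unset field is simply replaced wholesale, whereas the real code may combine list-valued fields such as `Env`.

```python
# Fields this thread lists as being read from the image's config.
MERGED_FIELDS = [
    "Cmd", "CpuShares", "Entrypoint", "Env", "ExposedPorts",
    "Memory", "MemorySwap", "User", "Volumes", "WorkingDir",
]


def merge_config(user_config, image_config):
    """Return a copy of user_config with unset fields filled from image_config.

    Assumption: a field counts as "unset" when it is missing or falsy
    (None, 0, "", empty list or dict). Fields outside MERGED_FIELDS,
    such as Tty, are never copied from the image.
    """
    merged = dict(user_config)
    for field in MERGED_FIELDS:
        if not merged.get(field) and field in image_config:
            merged[field] = image_config[field]
    return merged


image = {"Cmd": ["/bin/sh"], "User": "daemon", "Tty": True, "CpuShares": 8}
user = {"Cmd": ["/bin/ls"], "CpuShares": 0}
result = merge_config(user, image)
```

Here the run-time `Cmd` wins, `User` and `CpuShares` fall back to the image's values, and `Tty` is ignored, which matches the behavior described above only under the stated assumptions.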
```
Deleted: /etc/my-app-config
```
It then creates a Tar Archive which contains *only* this changeset: The added and modified files in their entirety, and for each deleted item it creates an entry for an empty file at the same location but prefixes the basename of the file with `.wh.`. These `.wh.` prefixed files are known as whiteout files. The resulting Tar archive for `f60c56784b83` has the following entries:
Suggested: "The filenames prefixed with `.wh.` are known as "whiteout" files."
Is the name the only indication of the special nature of these files? That is, if I had a file named `.wh.somename` actually in my tree, would the file be unpacked to the layer? I'm kind of hoping there is some permissions bit or something set that together with the name means it is a special file.
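For illustration, here is what extraction would look like if the `.wh.` name prefix really were the only marker, which is exactly the open question in this comment. The filesystem is modeled as a plain dict of path to content, and `apply_changeset` is a hypothetical helper, not Docker code.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def apply_changeset(filesystem, changeset):
    """Apply a layer changeset (path -> content) to a dict-based filesystem.

    Assumption: any entry whose basename starts with ".wh." is a whiteout,
    judged by name alone.
    """
    result = dict(filesystem)
    # Pass 1: each whiteout removes the named path and everything beneath it.
    for path in changeset:
        directory, basename = posixpath.split(path)
        if basename.startswith(WHITEOUT_PREFIX):
            target = posixpath.join(directory, basename[len(WHITEOUT_PREFIX):])
            for existing in list(result):
                if existing == target or existing.startswith(target + "/"):
                    del result[existing]
    # Pass 2: added and modified files replace whatever was there before.
    for path, content in changeset.items():
        if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            result[path] = content
    return result


base = {"/etc/my-app-config": "old settings", "/bin/app": "v1"}
layer = {"/etc/.wh.my-app-config": "", "/bin/app": "v2"}
merged = apply_changeset(base, layer)
```

Under this name-only rule, a genuine file called `.wh.somename` would be treated as a whiteout and never unpacked, which is the ambiguity being raised.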
👍 this is great to see!
fcaaef4 to e073386
I've just pushed a major update to the draft spec. Please review again if you already have!
Layer
</dt>
<dd>
Refers to either one or both of the JSON metadata and filesystem changes for a single link in a chain of layers that make up a complete image. To refer to either specifically, one may use the terms `Image/Layer JSON` or `Image/Layer Metadata` to refer to its JSON metadata and `Image/Layer Filesystem Changeset` or `Image/Layer Diff` to refer to the set of filesystem changes.
I find this pretty hard to understand.
Do you think it'd be okay to just delete the second sentence of this paragraph? I realize I probably went a little crazy in the second sentence... we really should agree on some common terminology though. It's a bit confusing to have a single term used loosely to refer to multiple things :(
Do you think we could force a definition and make sure we use that everywhere?
I'm okay with the first sentence definition if everyone else is.
I would phrase it something like this:
Images are composed of "layers". "Image layer" is a general term which may be used to refer to one or both of the following:
- The metadata for the layer, described in the JSON format
- The filesystem changes described by a layer
To refer to the former specifically, the terms "Layer JSON" or "Layer Metadata" are frequently used.
To refer to the latter, the terms "Image Filesystem Changeset" or "Image Diff" are frequently used.
WDYT?
sounds good to me, @nathanleclaire I'll update it.
Image ID
</dt>
<dd>
The randomly generated ID given to an image or image layer upon its creation. It is represented as a hexadecimal encoding of 256 bits, e.g., `a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
"image or image layer" -- this seems odd. Do images and image layers have IDs in different namespaces?
Need the image ID necessarily be random, or is the assertion simply that it need not have semantic meaning?
If random, must the ID be from a CSPRNG?
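As a point of reference for the questions above, a 256-bit hexadecimal ID of the shape quoted in the draft could be produced like this. The sketch uses Python's `secrets` module, which draws from the operating system's CSPRNG; whether the spec actually requires that strength is the open question, and `new_image_id` is a hypothetical name.

```python
import secrets


def new_image_id():
    """Return a random 256-bit ID as 64 lowercase hexadecimal characters."""
    # 32 bytes * 8 bits = 256 bits; each byte becomes two hex digits.
    return secrets.token_hex(32)


image_id = new_image_id()
```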
cc @jamtur01 would be great to get your input on this
Image Filesystem Changeset
</dt>
<dd>
An archive of the new or changed files and directories which a layer of an image has. This archive also contains special "whiteout" files, which have names beginning with `.wh.`, which describe that that file or directory has been deleted from its parent image's filesystem. These archives can be made trivially by a layer-based/union filesystem such as AUFS or OverlayFS or by computing the diff of two directories (one corresponding to a snapshot of the parent image's filesystem and the other the current image's filesystem).
It seems like any description of whiteout files and their semantics is going to #include the implementation details of a specific version of aufs, with specific config/compilation flags, along with the flags Docker invokes it with. For example, this paragraph from the aufs documentation:
The whiteout is for hiding files on lower branches. Also it is applied to stop readdir going lower branches. The latter case is called ’opaque directory.’ Any whiteout is an empty file, it means whiteout is just an mark. In the case of hiding lower files, the name of whiteout is ’.wh.<filename>.’ And in the case of stopping readdir, the name is ’.wh..wh..opq’ or ’.wh.__dir_opaque.’ The name depends upon your compile configuration CONFIG_AUFS_COMPAT. All whiteouts are hardlinked, including ’/.wh..wh.aufs.’
I'd rm the bit about whiteout files (cover it later) and phrase like:
An archive of the files which have been added, changed, or deleted in an image layer. Using a layer-based or union filesystem such as AUFS, or by computing the diff from filesystem snapshots, the filesystem changeset can be used to present a series of image layers as if it were one cohesive filesystem.
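The "diff from filesystem snapshots" route can be sketched as follows, assuming each snapshot is a plain dict of path to content and that deletions are recorded with the `.wh.` naming from the draft. `compute_changeset` and the example paths other than `/etc/my-app-config` are hypothetical.

```python
import posixpath


def compute_changeset(parent, current):
    """Diff two snapshots (path -> content) into a layer changeset.

    Added and modified files are included in their entirety; each deleted
    path becomes an empty entry whose basename is prefixed with ".wh.".
    """
    changeset = {}
    for path, content in current.items():
        if path not in parent or parent[path] != content:
            changeset[path] = content
    for path in parent:
        if path not in current:
            directory, basename = posixpath.split(path)
            changeset[posixpath.join(directory, ".wh." + basename)] = ""
    return changeset


parent = {"/etc/my-app-config": "old", "/bin/my-app-tools": "v1", "/etc/hosts": "h"}
current = {"/bin/my-app-tools": "v2", "/etc/my-app.d/default.cfg": "new", "/etc/hosts": "h"}
changes = compute_changeset(parent, current)
```

Unchanged files such as `/etc/hosts` stay out of the changeset, so the archive contains only what differs from the parent.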
Env <code>array of strings</code>
</dt>
<dd>
Entries are in the format of <code>VARNAME="var value"</code>.
Are these double-quotes normative?
I probably shouldn't have included the quotes in this example. I believe the way this value should be interpreted is that the substring before the first `=` is the variable name and everything after is the value. In Docker, I think this is passed directly to the execution driver in this format. @crosbymichael could you clarify this for us please?
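A small sketch of that interpretation, splitting on the first `=` only and leaving any quotes as literal characters of the value. `parse_env_entry` is a hypothetical helper, and rejecting entries without `=` is an assumption of this sketch rather than documented behavior.

```python
def parse_env_entry(entry):
    """Split an Env entry into (name, value) at the first '='."""
    name, separator, value = entry.partition("=")
    if not separator:
        # Assumption: an entry with no '=' is treated as malformed here.
        raise ValueError("missing '=' in Env entry: %r" % entry)
    return name, value


quoted = parse_env_entry('VARNAME="var value"')
nested = parse_env_entry("OPTS=a=b=c")
```

Read this way, the double quotes in the draft's example would be part of the value rather than syntax.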
<dd>
The username or UID which the process in the container should
run as. This acts as a default value to use when the value is
not specified when creating a container.
All the following are valid:
- `user`
- `uid`
- `user:group`
- `uid:gid`
- `uid:group`
- `user:gid`
If `group`/`gid` is not specified, the default group and supplementary groups of the given `user`/`uid` in `/etc/passwd` from the container are applied.
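The six accepted forms can be told apart with a split on `:` plus a numeric check, as in this sketch. `parse_user_spec` is a hypothetical helper; it only classifies the string and does not perform the `/etc/passwd` lookup described above.

```python
def parse_user_spec(spec):
    """Parse 'user', 'uid', 'user:group', 'uid:gid', 'uid:group' or 'user:gid'.

    All-digit parts are returned as ints (uid/gid), other parts as names.
    A group of None means "fall back to the user's default and
    supplementary groups from the container's /etc/passwd".
    """
    def classify(part):
        return int(part) if part.isascii() and part.isdigit() else part

    user, separator, group = spec.partition(":")
    return {
        "user": classify(user),
        "group": classify(group) if separator else None,
    }


by_name = parse_user_spec("daemon")
mixed = parse_user_spec("1000:staff")
```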
thanks @tianon !
lgtm. No doubt there are still nits to be picked, but this is a great step forward in unambiguously-described behavior. Thanks for your time and effort in compiling it.
@jlhawn Any more edits remaining? There is no reason for shykes to look at this if you are just documenting the reality of the current system.
Correct, if we're documenting today's design don't feel obligated to wait for my +1
@crosbymichael I think I just need to add in the
This is brilliant. Thanks @jlhawn. I'm really glad we're taking steps towards specifying how Docker works. Perhaps we could get this in for Docker 1.5 and shout about it a bit. ^_^
I don't know what we are waiting on @jlhawn
Maybe just a squash of commits?
Many iterations have gone into documenting a v1 specification of Docker's Image format.

v1 Image spec: clarify parent field
- metalivedev pointed out that the description was ambiguous, so I've removed mention that it was randomly generated. It IS the ID of the parent image.

Updated v1 image specification documentation
- More complete details and deprecation notifications for each field in the JSON metadata of an image.
- Details on the format for packaging combined Image JSON + Filesystem Changeset archives for all layers of an image.

Clarify description of an image "Layer" in v1 spec

Updated intro of image v1 spec

Updated image v1 spec after more review
- Removed description of "Image" from the terminology section. The entire document is meant to serve this purpose.
- Updated the definition of "Image Filesystem Changeset".
- Clarified the level of randomness needed for generating image IDs.
- Updated the description of "Image Checksum".
- Added term descriptions for "Repository" and "Tag"
- Removed extraneous/implementation-specific fields from the Image JSON example file and field descriptions:
  - removed "container_config" and "docker_version" fields.
- Added missing "author" field example and description.
- Removed extraneous/implementation-specific fields from the "config" struct example and description:
  - removed "Hostname", "Domainname", "Cpuset", "AttachStdin", "AttachStdout", "AttachStderr", "PortSpecs", "Tty", "OpenStdin", "StdinOnce", "Image", "NetworkDisabled", and "OnBuild".
- Updated example Image JSON config with better example values for "Env", "Cmd", "Volumes", "WorkingDir", "Entrypoint", "CpuShares", "Memory", "MemorySwap", and "User".
- Added notices that any fields not specified are to be considered as implementation specific and should be ignored by implementations which are unable to interpret them.
- Updated example of creating layer filesystem changesets to use less formal language.
- Listed more details in the section regarding extraction of a bundle of image layers into the root filesystem of a container.
- Updated the closing mention of Docker as an evolving implementation.

More updates to the v1 image spec
- Added line wrapping after 80 columns per line to adhere to documentation style guides, as pointed out by @jamtur01
- Removed references to any specific docker commands, updated a few descriptions or dropped repeated statements, as pointed out by @cpuguy83

Cleanup image v1 spec draft after fredlf comments

Address comments by mmdriley on v1 image spec

Improve description of image v1 spec 'config.User'
- Improves description of image v1 specification for the 'User' runtime parameter after recommendations by tianon.

Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
@jfrazelle All squashed!
Adds Docker Image v1 Spec Documention
awesome!
Thanks @jlhawn!
I was talking to @shykes about this at the dockercon in amsterdam...great job guys!
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)