Curious but Lazy: August 2023

Git, the currently popular version control system, is a pain to understand for people who come from CVS/Subversion and are used to having just a central repo and a local working directory.
Since Git is a distributed repository system, each developer has their own repository, which keeps track of local changes and can sync with one or more remote repositories. Hence the concepts of Git are rather different from those of SVN, and the same terms, like checkout, mean different things in SVN and Git. Being an SVNer earlier, I too found Git frustrating to start with.

Atlassian has a good tutorial that explains these concepts :
https://www.atlassian.com/git/tutorials/learn-git-with-bitbucket-cloud

Some important concepts :

The local repository is a full-fledged repository that independently holds the history of its own branches, commits, etc. One can create and work with a local repository without ever connecting to a remote one, say for private work that is not shared.
We usually start by cloning a remote repository to a local one (see the clone command). This copies the remote repository to the local machine and adds a reference to the remote repository, usually named "origin", to the local repo's list of remotes (see the remote command). Thus, when syncing with the original repository, we can refer to it as "origin".
The checkout command does NOT check out files from the remote. Instead, it checks out the specified branch into the working directory, i.e. that branch becomes the current one.
Git has a staging area where the changes to be committed are kept. We specify which files are to be staged using the add command. Without adding, the commit command has nothing to work on.
When you push your changes to the remote, you are syncing the changes committed in your local repo to the remote one, so you must have committed them to your local repo first. This may not be obvious to SVNers, who would expect locally updated files to be pushed automatically!
TODO : See the branch command
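Unlike SVN, we do not need separate working folders for branches. We can switch to another branch in the same working folder using the checkout command. This means uncommitted changes need care: Git carries them over to the other branch when they do not conflict and refuses to switch when they do, so we usually commit them or stash them to a temporary location first.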
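
The points above can be tried end to end in a throwaway directory. The following is only a sketch: a local bare repository stands in for the shared remote, and the names remote.git, work, notes.txt and feature are made up for the illustration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing here touches a real project.
demo=$(mktemp -d)
cd "$demo"

# A local bare repository plays the role of the shared remote (assumption).
git init --quiet --bare remote.git

# clone copies the repository and registers it as the remote "origin".
# An empty remote produces a harmless warning, silenced here.
git clone --quiet remote.git work 2>/dev/null
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
remote_name=$(git remote)

# add stages the file; commit records it in the LOCAL repository only.
echo "hello" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# push syncs the locally committed changes to origin.
branch=$(git symbolic-ref --short HEAD)
git push --quiet origin "$branch"
remote_commits=$(git --git-dir=../remote.git rev-list --count "$branch")

# Branches share one working folder; stash parks uncommitted edits
# while we switch away and come back.
git checkout --quiet -b feature
echo "draft" >> notes.txt
git stash push --quiet
git checkout --quiet "$branch"
git checkout --quiet feature
git stash pop --quiet
last_line=$(tail -n 1 notes.txt)

echo "remote=$remote_name commits_on_remote=$remote_commits last_line=$last_line"
```

Note that the remote only sees the commit after the explicit push, which is the step SVNers tend to forget.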

Coming from a build tool like Ant, it can be pretty frustrating for developers to understand Maven. It seems to be doing too many unspecified things, too rigidly. This article tries to understand Maven from that perspective.

So here's a quick summary :

Maven executes goals, just as Ant executes targets.
In addition, Maven has phases, each phase being a list of goals. We can tell Maven to execute one or more phases, and each phase then executes the goals bound to it. Phases have a fixed order, so when we execute a given phase, the phases that come before it in that order are executed first.
The main phases of the default lifecycle are validate, compile, test, package, verify, install and deploy.
Each Maven run happens within a lifecycle, which has phases under it. The built-in lifecycles are default, clean and site.
We can define our own custom goals. These are packaged in a plugin.
Maven also allows us to specify and manage the dependencies of the project.
Maven artifacts are identified by a groupId, artifactId and version.
Maven provides a repository to fetch and store artifacts. The central Maven repository is the default. Maven also keeps a local repository on the system where it runs, to avoid fetching from the remote repository each time.
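
The ordering rule for phases can be illustrated with a toy model. This is not Maven itself: phases_up_to is a hypothetical helper, and the phase list is just the abbreviated one from the summary above.

```shell
#!/bin/sh
# Toy model only: the abbreviated phase list from the summary, in order.
PHASES="validate compile test package verify install deploy"

# Print every phase from the start of the list up to and including $1,
# which is what asking Maven for that phase would run through.
phases_up_to() {
  target=$1
  result=""
  for phase in $PHASES; do
    result="${result:+$result }$phase"
    if [ "$phase" = "$target" ]; then
      echo "$result"
      return 0
    fi
  done
  echo "unknown phase: $target" >&2
  return 1
}

phases_up_to package
```

So asking for package implies validate, compile and test first, while verify, install and deploy are left alone.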

Read these links first for a basic understanding :
https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html

In Ant, we specify tasks and their dependencies. We can execute a specific task, and only that task, along with the tasks it depends on, will be executed. But this also means that tasks like clean, compile, generate, copy, package, install, deploy etc. have to be specified in detail for every project, along with config like the source, staging and target dirs. External dependencies like jars also have to be managed, along with their versions, and these may be common to multiple projects. Should we check them into source control, or keep them in a separate common location? What if we want to add common functionality to all builds? How do we name and version output artifacts?

Maven tries to answer these questions. It tries to provide a standard way to execute projects, by promoting convention over configuration :

Standardized project directory structure, e.g. src/main/java, src/main/resources, src/test, etc.
Naming conventions for artifacts using groupId, artifactId and version
Providing a dependency configuration, and a repository mechanism to store and access needed dependencies
Out-of-the-box implementation of the standard build lifecycle, so that a project can be built with minimum configuration
An inheritance mechanism, so that a build may be shared amongst multiple projects, each overriding only the parts needed.
Profiles to have different builds for different cases, e.g. we might just need to compile in dev mode, while the full jar with dependencies may have to be built in production mode.

So if we have a simple Java project with no dependencies, a minimal pom with just the groupId, artifactId and version (plus the mandatory modelVersion) is able to build the project from clean through to install. Deploy additionally needs a target repository to be configured.
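
As a rough sketch, such a minimal project could be laid out as follows. The coordinates com.example, hello-app and 1.0-SNAPSHOT are placeholders, and the modelVersion element is included because Maven requires it alongside the three coordinates.

```shell
#!/bin/sh
set -eu

# Create the project in a throwaway directory.
project=$(mktemp -d)

# Conventional source folders that Maven looks for by default.
mkdir -p "$project/src/main/java" "$project/src/test/java"

# Minimal pom: modelVersion plus the three coordinates (placeholder values).
cat > "$project/pom.xml" <<'EOF'
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>hello-app</artifactId>
  <version>1.0-SNAPSHOT</version>
</project>
EOF

echo "wrote $project/pom.xml"
```

Everything else (packaging as a jar, where sources live, where output goes) falls back to Maven's conventions.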

Maven works with lifecycles, phases and goals, and has default implementations of these. A lifecycle is an ordered list of phases, and we can also create our own. Goals do the actual work, each executing in a particular phase. E.g. in the default lifecycle, the install phase has the goal install:install bound to it.

Goals are implemented as Java classes (called Mojos) that are packaged into plugins. When creating a plugin, we can specify the phase in which a goal should execute, though it's not mandatory. The phase can also be specified in the build configuration, and this overrides the default.

What can we execute with the maven command ?

A list mixing phases and goals. For every phase specified, all the phases before it in its lifecycle are implicitly executed.

However, that's not the case with goals; only the specified goals are executed.

E.g. we can execute just the install goal using mvn install:install, install:install being the name of the goal bound to the install phase. Earlier phases will not be run. However, this can cause problems: in this case the goal looks in the execution context for the name of the output jar to install, and that is missing. It can be remedied by also running the jar:jar goal before it, so that it gets the jar name.
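
The different kinds of invocation can be lined up side by side. The run wrapper below is a hypothetical helper that only prints each command unless RUN_MAVEN=1 is set, so the sketch is safe to paste outside a Maven project.

```shell
#!/bin/sh
# Dry run by default; set RUN_MAVEN=1 inside a Maven project to execute for real.
run() {
  if [ "${RUN_MAVEN:-0}" = "1" ]; then
    "$@"
  else
    echo "would run: $*"
  fi
}

# A phase: every earlier phase of the default lifecycle runs first.
run mvn install

# Two phases from two different lifecycles, executed in the order given.
run mvn clean package

# Goals only: no phases are run, so jar:jar goes first to produce the jar
# that install:install then puts into the local repository.
run mvn jar:jar install:install
```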

What does the maven build file consist of ?

The groupId, artifactId and version of the project being built; this is the minimal info needed.
The list of dependencies needed by the project.
The list of dependency repositories, if using any other than the standard Maven one.
The list of repositories, if any, to deploy/publish the final artifact to (usually a jar, war, etc.).
The build section, to use and configure non-standard Maven plugins (in which phase and with what params they are to run), and similarly to customise the execution of the standard Maven plugins.
A profiles section.
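
To make the list concrete, here is a sketch of a pom with one entry per section. All coordinates, ids and repo.example.com URLs are placeholders, and the plugin and dependency versions are merely illustrative choices, not recommendations.

```shell
#!/bin/sh
set -eu
outdir=$(mktemp -d)

cat > "$outdir/pom.xml" <<'EOF'
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Coordinates of the project being built (placeholder values) -->
  <groupId>com.example</groupId>
  <artifactId>hello-app</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- Dependencies needed by the project -->
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <!-- Extra repository to fetch dependencies from (placeholder URL) -->
  <repositories>
    <repository>
      <id>example-repo</id>
      <url>https://repo.example.com/maven</url>
    </repository>
  </repositories>

  <!-- Where deploy publishes the final artifact (placeholder URL) -->
  <distributionManagement>
    <repository>
      <id>example-releases</id>
      <url>https://repo.example.com/releases</url>
    </repository>
  </distributionManagement>

  <!-- Bind a plugin goal to a phase: build a sources jar during package -->
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-source-plugin</artifactId>
        <version>3.3.0</version>
        <executions>
          <execution>
            <id>attach-sources</id>
            <phase>package</phase>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <!-- A profile for a quicker dev build (hypothetical id), used via -Pdev -->
  <profiles>
    <profile>
      <id>dev</id>
      <properties>
        <maven.test.skip>true</maven.test.skip>
      </properties>
    </profile>
  </profiles>
</project>
EOF

echo "wrote $outdir/pom.xml"
```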

Tuesday, August 22, 2023

Git for Svners

Maven for ANTers