Don’t Reinvent the Wheel - Just Use It!
Developing new software can be a very time-consuming task. Thankfully, it is very easy to share the code and resources from previously developed software, making it simple for large numbers of programmers to work together, sometimes completely indirectly, to solve a new problem.
In this chapter, we’re going to explore software libraries and how we can take advantage of easily reusable pieces of software to make our job as programmers even easier. We’ve already used several of these libraries in our programs, but this is a good chance to step back and take a look at the broader software ecosystem and how it all fits together.
In this chapter, we’ll learn about the following concepts:
In the following chapter, we’ll learn how to use the tools we’ve already explored in this class, plus a few additional tools, in order to create our own software libraries that we can distribute based on our code. We’ll also explore how to use an external library in our ongoing project, including how to manually install one that isn’t available in a repository.
The term software library can actually mean several things. In essence, a software library is a collection of resources that can be used by a computer program, either while it is running or while it is being developed. As we’ve learned so far in this course, it makes sense to think of a large software program as a few smaller packages or subsystems that work together. With that view in mind, a software library is simply a subsystem or package that is developed outside of our application. In most cases, it is meant to be reused by many different programs.
In fact, one of the major benefits of writing our software in a modular format is to enable this exact kind of code reuse. A program developed for one task may include code that could easily be repurposed for a similar task. A great example is a system for ordering food online at a restaurant. That same system includes many of the same components that would be required for any other e-commerce website, such as one that sells handmade arts and crafts. So, many portions of that software could be turned into general-purpose libraries that can be reused.
The diagram above shows an example of what it might look like for an application to use an external library to handle playing an Ogg Vorbis file, which is an audio file format similar to an MP3. It makes no sense for us to “reinvent the wheel” for playing an Ogg Vorbis file, even though the entire format is published online. Instead, we can find a compatible library for the language we are using, such as the libvorbisfile library shown in this diagram, and include that library in our software.
To play the file, we can call the functions in the library’s API that accomplish that task. When we do, we can provide the Ogg Vorbis file as input, and we’ll receive a decoded audio stream that we can send to yet another library that handles playing audio on our system. So, we can see these libraries as just another set of subsystems that our code interacts with in order to achieve its intended goal.
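The same calling pattern can be sketched in Python. Python’s standard library does not include an Ogg Vorbis decoder, so this sketch swaps in the standard wave module (for WAV audio) purely as an illustration; the function names write_silence and describe_audio are made up for this example. The point is that our code hands a file to the library’s API and receives decoded information back, without ever parsing the file format itself.

```python
import io
import wave


def write_silence(num_frames, frame_rate=8000):
    """Use the wave library to encode silent 16-bit mono audio in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)   # mono
        writer.setsampwidth(2)   # 2 bytes per sample = 16-bit audio
        writer.setframerate(frame_rate)
        writer.writeframes(b"\x00\x00" * num_frames)
    buffer.seek(0)
    return buffer


def describe_audio(audio_file):
    """Let the library decode the file header instead of parsing bytes ourselves."""
    with wave.open(audio_file, "rb") as reader:
        return reader.getnchannels(), reader.getframerate(), reader.getnframes()


# Our code only talks to the library's API: channels, frame rate, frame count.
print(describe_audio(write_silence(100)))
```

A real audio player would pass the decoded frames on to a second library that handles sound output, just as described above.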
There are many different types of software libraries available. However, many modern languages such as Java and Python have greatly simplified the task of using these libraries. This is because both Java and Python rely on an underlying program to execute our code (either the Java Runtime Environment, or JRE, for Java, or the Python interpreter for Python), which handles connecting our code to the various libraries available on our system. So, as developers, we rarely need to worry about the differences between library types in our code.
However, if we develop programs using languages like C or C++ that are designed to be compiled directly to executables that run on the system, these different library types become much more important.
A static library is a software library that is statically linked to our executable file when it is compiled. Linking a library involves combining our application’s code with the libraries it uses into a single executable. So, that library’s code is included directly in our application, and we don’t have to include any additional files in order for our application to function. However, that means that we’ll have to recompile our application completely each time we want to update any of the libraries that it uses. In addition, if many applications on a system all use the same library, they’ll all have to include a copy of that library in their executable, sometimes eating up valuable storage space in the process. Originally, all libraries were static libraries, but eventually developers came up with a new, more flexible system.
A shared library is a software library that can be shared among multiple executable files. Instead of being included in the application when it is compiled, the library code can be dynamically linked when the application is executed. So, when we load our program, the operating system can search for any required shared libraries on the system, load them into memory, and link them to our application so we can use them. Users of the Microsoft Windows operating system may be familiar with the DLL file type, which is the “dynamic-link library” format used by that operating system. So, a DLL file is simply a software library that is meant to be dynamically linked to an application when it executes.
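Even from Python, we can watch dynamic linking happen by using the standard ctypes module to load a shared library while the program is running. The sketch below assumes a Unix-like system where the C math library can be found under the name "m"; on other systems the lookup may fail, in which case the hypothetical helper shared_library_sqrt simply returns None.

```python
import ctypes
import ctypes.util


def shared_library_sqrt(value):
    """Call sqrt() from the system's shared C math library, if one is found."""
    # Assumption: the math library is discoverable under the name "m".
    path = ctypes.util.find_library("m")
    if path is None:
        return None  # no shared math library was located on this system
    try:
        libm = ctypes.CDLL(path)  # the OS loader links the library at run time
    except OSError:
        return None
    # Describe the C function's signature so values are converted correctly.
    libm.sqrt.argtypes = [ctypes.c_double]
    libm.sqrt.restype = ctypes.c_double
    return libm.sqrt(value)


print(shared_library_sqrt(9.0))
```

Nothing from the math library is copied into our program. It is located and loaded only when the function runs, which is the essential difference from a static library.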
The major benefit of this approach is that the library can be installed on a system just once, and then many different applications can make use of it without including that library’s code in their individual executables. In addition, the library can be updated once on a system and the new version will be available for any application that uses it.
Unfortunately, there are several downsides to this approach as well. One major issue is when applications require different versions of the same library in order to function. In that case, the operating system must maintain several versions of the same library, and if done incorrectly this could lead to one application overwriting the library required by another. In addition, this means that the space savings by sharing libraries among applications can be greatly diminished if applications are not able to share the same version of the library. In fact, there are entire articles on Wikipedia dedicated to the issues with dynamically linked libraries, a problem collectively known as Dependency Hell.
As object-oriented programming became more common, many programming languages started to support class libraries as a way to share code between applications. In this case, a class library is simply a set of classes, usually either provided as source code or a compiled version of the code, which can then be integrated into another application. In this way, the library looks just like any other portion of the software, and can easily be used by developers in their applications.
Great examples of class libraries are the standard libraries included with many programming languages, including Java and Python. We’ve used these libraries extensively in our code, mainly to support reading from and writing to files, storing data in data structures, and even creating GUIs for our applications. Anytime we import something into our code that we didn’t write ourselves, we’re taking advantage of a class library that was written by another developer.
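As a small illustration, the sketch below leans entirely on classes and functions from the Python standard library: Counter for a data structure, Path and json for file input and output, and tempfile for a scratch folder. The helper names count_words and save_and_reload are invented for this example.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def count_words(text):
    """Counter is a ready-made class for tallying items."""
    return Counter(text.lower().split())


def save_and_reload(counts, folder):
    """Path and json handle writing to and reading from a file for us."""
    target = Path(folder) / "counts.json"
    target.write_text(json.dumps(counts))
    return json.loads(target.read_text())


with tempfile.TemporaryDirectory() as folder:
    counts = count_words("the wheel the library the wheel")
    print(save_and_reload(counts, folder))
```

Every import at the top of this file is a class library written by someone else, which is why the code we had to write ourselves is so short.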
Going forward, when we refer to a software library in this course, we will usually be referring to a class library as described above.
One question that comes up frequently when discussing software libraries is the difference between a library and a framework, as many times the terms are used interchangeably. So, let’s briefly explore the difference between the two and how they interact with each other.
As discussed on the previous page, a library is a reusable software component that has an API that we can make use of in our code. So, our code will call methods in the API to perform the desired actions, and we’ll typically import the library’s code into our own code files.
Structurally, we can think of our code as a wrapper around the library. We’re using the library in our application, so we are in control of what it does.
On the other hand, a framework is a piece of software written to perform a specific task or be used in a specific way. However, the framework includes places where a developer can customize the code to change the actions performed by the framework. Many frameworks can be used without any customization at all, but in most cases the framework will not do anything useful without additional code added by a developer.
Some great examples of a framework are the Python Flask and Java Spring frameworks, which are both designed to create web applications. They provide the overall structure for a web application, including the routing, page templates, receiving requests from a browser and creating a response, and more. Then, the developer can customize the web application by providing code to add individual web pages, API endpoints, and databases to store data. All of the customization to make the application meet the needs of the user is handled by the developer, but the framework itself is responsible for the overall structure and operation of the application.
Put another way, a framework is a wrapper around our code. The framework is in control of what the application does, and it calls our code as needed to create the desired pages and send the correct output back to the user.
A key concept of a software framework is this inversion of control, where the program’s overall structure and operation are determined by the framework itself and not our code. As shown in the diagram above, a framework calls our code, and then our code calls code stored in libraries. That is the easiest way to spot the difference between a framework and a library.
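Inversion of control can be sketched with a toy example. TinyFramework below is a hypothetical stand-in written just for this illustration. It is not Flask, Spring, or any real package. We register a function with the framework, and the framework decides when that function runs; our function in turn calls a library (the standard html module).

```python
import html


class TinyFramework:
    """A hypothetical stand-in for a web framework; not a real package."""

    def __init__(self):
        self._routes = {}

    def route(self, path):
        """Register a developer-supplied function for a path."""
        def register(handler):
            self._routes[path] = handler
            return handler
        return register

    def handle(self, path):
        """The framework decides when (and whether) our code runs."""
        handler = self._routes.get(path)
        if handler is None:
            return "404 Not Found"
        return "200 OK: " + handler()


app = TinyFramework()


@app.route("/menu")
def menu_page():
    # Our code is called by the framework, and it calls a library (html).
    return html.escape("Fish & Chips")


print(app.handle("/menu"))
print(app.handle("/missing"))
```

Notice that we never call menu_page ourselves. The framework owns the overall flow, including what happens when no page matches.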
Frameworks also make extensive use of the template method pattern that we learned about in an earlier chapter. Our code will implement parts of the template method, such as a template method for sending a web page to a web browser. We’ll provide one part of the content, and we can override other methods to customize it as needed, but the framework itself will use the template method to actually send the web page to the browser.
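Here is a minimal sketch of that idea. The PageSender class is a hypothetical framework class invented for this example; its send method is the template method, fixing the order of the steps, while our subclass supplies the content and overrides one optional step.

```python
class PageSender:
    """Hypothetical framework class; send() is the template method."""

    def send(self):
        # The fixed sequence of steps lives in the framework.
        parts = [self.header(), self.body(), self.footer()]
        return "\n".join(parts)

    def header(self):
        return "<html><body>"

    def body(self):
        raise NotImplementedError("subclasses supply the page content")

    def footer(self):
        return "</body></html>"


class MenuPage(PageSender):
    """Our code: one required piece, plus one optional customization."""

    def body(self):
        return "<h1>Menu</h1>"

    def footer(self):
        return "<p>Open daily</p></body></html>"


print(MenuPage().send())
```

A real framework would send the result to a browser rather than return a string, but the division of responsibility is the same.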
In a later chapter in this course, we’ll explore the Python Flask and Java Spring frameworks a bit deeper, and see how we can use them in our ongoing software project to make it available via the internet.
So far, we’ve learned about what software libraries are, and how they differ from other, similar tools such as software frameworks. However, you are probably wondering: “how do I find these libraries and add them to my application?” Let’s discuss the various places you can learn about software libraries and how to use them in your applications.
One of the most common ways to find software libraries to include in your application is to review the libraries available in a repository of libraries for your language. A repository is simply a database of content that you can use, and most languages include a way to automatically find and install libraries that are available in a standard repository. Most of those libraries are provided as packages, which is simply a name for the library and any supporting files or resources all bundled together in a single downloadable file.
For example, in Python we’ve used the pip3 tool to download and install many different tools for Python. The pip3 tool downloads packages from a central repository called PyPI (Python Package Index). The PyPI website includes a very robust search tool for finding and learning about the various packages available for Python. In the next chapter, we’ll see how we can package our own applications up and make them ready for submission to PyPI.
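Once a package has been installed, Python can report which version is present. The sketch below uses the standard importlib.metadata module (available in Python 3.8 and later); the helper name installed_version is made up for this example, and which packages it finds depends entirely on the environment it runs in.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The result depends on what is installed in the current environment.
print(installed_version("pip"))
```

This can be a quick way to confirm that a package from the repository actually made it into the environment our application is using.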
In Java, we are using Gradle as our build tool, and it is able to download packages from many different repositories. In most cases, we will be using the Maven Central repository.
In addition to the repositories listed above, many software libraries and packages are available for download directly from the internet, usually from the library developer’s website. For example, many of the Java libraries developed by the Apache Software Foundation can also be downloaded directly from their Distribution Directory. Many Java packages are commonly offered via direct download as well as through repositories, mainly because the popularity of distributing software via a repository is more recent than the development of Java.
For Python, on the other hand, by far the most common method of installing packages is simply via the pip3 command that downloads them directly from PyPI. However, it is possible to download these packages directly from PyPI as a Python Wheel, which we’ll learn about in the next chapter.
In both languages, the ability to download and install these packages directly is important for many reasons. There may be instances where the developer may not have direct access to the Internet, such as in a highly secure computing environment. So, tools such as pip3 and Gradle cannot be used to download the packages.
In fact, many developers working in a secure environment can choose to host their own internal repositories for software packages, making sure that they have access to the packages they need while still being able to control the exact version and contents of those packages.
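One common way to check that a manually downloaded package has exactly the expected contents is to compare its cryptographic hash against a value published by a trusted source. The sketch below is only an illustration of that idea using the standard hashlib module; the helper names are invented, and where the expected hash comes from is an assumption left to the reader’s environment.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with a hash obtained from a trusted source."""
    return sha256_of(path) == expected_hex.lower()


with tempfile.TemporaryDirectory() as folder:
    package = Path(folder) / "demo-package.bin"  # stand-in for a download
    package.write_bytes(b"hello")
    trusted = hashlib.sha256(b"hello").hexdigest()
    print(matches_expected(package, trusted))
```

If even one byte of the file differs, the digests will not match, which makes this a cheap first check before installing anything by hand.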
Finally, many open-source software libraries can be directly built from the source code. The vast majority of open-source software today stores its source code in publicly available code repositories such as GitHub, GitLab, or SourceForge. So, a developer can choose to download the source code directly and build the library themselves, or possibly even edit the source code as needed to match a particular use case. In most open-source communities, this kind of experimentation and reuse is highly encouraged.
Of course, this can present many hassles as well. Many more advanced software libraries contain thousands of lines of code, and they can be very complex to modify, build, and distribute. Most large-scale open-source projects have large amounts of automation that handles this process, so doing it ourselves as a single developer can be very daunting. In addition, any time the library is updated we’ll need to manually update our version as well, or else we risk our software becoming obsolete and possibly vulnerable to security issues.
When downloading or installing software libraries, one aspect that should always be considered is the security of your application. There are many instances of open source software libraries containing either security flaws or malicious code, many of which are only discovered months or years after appearing in the application. So, we must always be aware of these risks and how they can impact the overall security of the application we are building.
While there is no way to avoid all security issues, here are some quick things we should keep in mind when reviewing which software libraries to include in our code:
On the next page, we’ll dive a bit deeper into software licenses and the impact that may have on the libraries we can use in our application.
In the world of software, we use the term open-source to refer to any software that has source code that is openly available. This is in contrast to proprietary software, sometimes referred to as closed-source software, which is software with source code that is not publicly available. The software itself may be sold, or even provided for free, but the actual source code is protected by the company.
So, before we can use just any software library we find, we should consider what license it uses and how that impacts our ability to use that library. On this page, we’ll briefly discuss some of the licenses and terminology used in industry today.
The information below is my best attempt to help simplify the vastly complex legal documents that make up a software license. However, this simple information may not be enough to fully understand all of the nuances of how a particular software license impacts your ability to use it or distribute it with your own software.
In general, most software that is licensed under one of the more permissive licenses listed below can be safely used in your application, and many (but not all) of them allow you to distribute that software as part of your application as well.
However, when in doubt, you should always read the documents carefully and seek competent legal advice if you are ever unsure. It is always best to make sure you are properly complying with the license of a piece of software you are using.
First, let’s discuss free software. The term “free” has two different meanings, and they are sometimes applied to software interchangeably:
So, when we say software is “free,” it is always important to know which definition of “free” we are using. For example, the Slack messaging application is available for free, meaning no cost, but with some restrictions applied. Google’s Chrome web browser is also free, meaning no cost, and is based on the open source Chromium project, but Chrome itself is not open source since it contains some proprietary software. These free programs are sometimes referred to as “freeware” - meaning that they are available without cost, but still use proprietary source code.
The Linux Kernel is an example of a piece of software that is free and open source; however, even it has some restrictions applied to it. Namely, the license of the Linux Kernel requires that any software built using the source code of the kernel (a derivative work) must also be offered under the same license.
In fact, the Free Software Foundation has developed a set of four “freedoms” that are used to determine if a piece of software can truly be labelled “free”:
Software that meets these four criteria is sometimes called “libre” software, or FOSS (“Free Open Source Software”), to differentiate it from the traditional definition of the word “free”.
So, as we can see, the term “free” really isn’t a great way to discuss software licenses. Instead, we’ll focus in on some more specific licenses and what they mean.
The least restrictive license is the public domain license, meaning that the software can be used by anyone for any purpose, without any restrictions. However, the lack of a license does not mean that the software is in the public domain - quite the opposite in many cases. In the United States, any work, whether published or not, is automatically copyrighted to the original author until 70 years after the author’s death. So, to release software into the public domain, a proper license must be attached to the software.
One common public domain license is the Creative Commons CC0 license, which basically waives as many legal rights as possible on any work that the license is applied to. GitHub recommends a similar license called the Unlicense.
Permissive licenses place few restrictions on the use and distribution of the software.
A permissive license commonly used for software is the MIT License, which places very few restrictions on the use of the software other than requiring that the license itself be included in any copies or portions of the software. In addition to granting permission, it also includes a disclaimer stating that the software is offered without any warranty and the authors are not liable for any damages caused by use of the software.
This license is used by a large number of open source projects, and typically is one of the easiest to use.
The next level of licenses are the copyleft licenses. These licenses typically require that any derivative works also be licensed with the same rights. So, if we include a software library that is using a copyleft license in our software, we cannot then make our software proprietary and sell it, as this would violate the copyleft license of that software library.
The most common copyleft license used in software is the GNU General Public License, or GPL, which is used by the Linux Kernel and many other applications typically included as part of a Linux distribution. This requires that any derivative works of the Linux Kernel also be made available under a copyleft license.
One major open question with copyleft licenses - if a piece of software uses a library that is licensed with a copyleft license, but doesn’t distribute that library directly as part of its package, is it still considered a derivative work?
This is a hotly debated question with the Wikipedia Article for the GPL laying out many different points of view on the topic. For example, when a library is statically linked to an application, it is inseparable. However, does the same hold if the library is dynamically linked? Likewise, if a piece of software is just using the public API of a library without modifying the library’s source code, is it a derivative work?
These questions make licensing software that uses libraries under a copyleft license very confusing. Because of this, there is also a GNU Lesser General Public License, or LGPL, that specifically addresses this issue by allowing other software to link to libraries licensed with the LGPL without it being considered a derivative work.
The last category of software licenses are the unique, proprietary licenses that are attached to software that is not open-source or free. Each one of these licenses can vary widely in the rights given to the users, so we should always read them carefully before using them.
In summary, the licenses of the libraries we use when building our software, as well as any tools or platforms, can all impact the eventual license we can assign to our application if we choose to distribute it. In most cases, we can freely use any open source software for personal use, but as soon as we wish to distribute, or possibly sell, our own software, then we will have to determine what license can be applied to our work. We’ll review that in a later chapter.
The use of software libraries can be complex, as seen in the earlier discussions in this chapter. There are licensing issues to consider, security issues to worry about, and even then we might struggle to find the library that best fits our needs. However, let’s take a step back and review why we should definitely still consider using these libraries wherever possible in our work.
The saying “don’t reinvent the wheel” is a good one to keep in mind when writing new software. In most cases, large parts of any software we wish to write have already been written many times before. So, instead of doing all of the work to recreate that software, we can instead try to find a library or framework that does what we need, and spend our time working on how to make that software fit our needs.
In the article A Padawan Programmer’s Guide to Developing Software Libraries, the authors list a very important lesson as the first lesson any developer should learn when approaching a new project: “Identify a Need for a New Piece of Software.” In essence, whenever we wish to develop something, we should first consider whether it has been done before. If so, it may be worth looking at how we can adapt an existing solution to fit our needs, rather than building a new one from scratch.
A great example of this is building a database-driven website. A naive approach (one taken by this textbook’s author while in college) would be to write all of the code to generate each page by hand, without using any external frameworks or libraries besides the one required to interface with the database. The website would work, but it would be very complex and prone to errors. In addition, maintaining that code could be difficult due to its complexity.
A better solution would be to find a website framework that is able to handle interfacing with the database and generating pages for us, and then customize those pages to fit our needs.
Likewise, when writing a program to perform statistical analysis or machine learning on some data, we can usually rely on well written, well documented, and typically very efficient libraries to handle the work for us. By doing so, we reduce the risk of our code producing incorrect results due to the algorithm being incorrect (though we could just as easily use the algorithm incorrectly or provide it bad data).
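As a small sketch of this, Python’s standard statistics module already provides carefully written implementations of common calculations, so there is no need to write the formulas by hand. The sample data and the helper name summarize below are made up for illustration.

```python
import statistics


def summarize(data):
    """Rely on the library for the math instead of hand-written formulas."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "population_stdev": statistics.pstdev(data),
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
print(summarize(scores))
```

The caveat above still applies: the library computes exactly what we ask for, so choosing pstdev (population) when stdev (sample) was needed would still give a misleading answer.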
In short, it is always worth taking the time to review the libraries and frameworks available for our programming language. We may find that we can easily combine a few of them together to achieve our desired result.
Of course, relying on a library developed by someone else does have its pitfalls. For example, what if the original developer suddenly decides to “unpublish” the library, making it unavailable for download in the future? That could cause issues for any developer or application that relies on that library, making them also stop working. As libraries are often built on other libraries, such a cascading effect could have dire consequences.
For more information on this event, see this article from Ars Technica.
Update 2022: It happened again! This time the faker libraries were broken by the developer. See this article.
On the next pages, we’ll briefly look at how to work with external libraries in both Java and Python. As always, feel free to review the content for your chosen language, but you are welcome to read both sections if desired.
Java typically uses a special type of file called a JAR file, short for “Java Archive” file, to create a downloadable package that may contain Java source code, compiled class files, and additional data. We can even include compiled Javadoc documentation directly in a JAR file.
Most software libraries for Java are distributed as JAR files, including from the major repositories such as Maven Central. In addition, most websites that offer direct downloading of Java libraries typically use the JAR file format.
A JAR file itself is built using the same format as the ZIP file format. The Java Development Kit (JDK) includes a special command, jar, that can be used to create a JAR file or extract the contents from an existing JAR file.
Finally, a JAR file can include additional information in a manifest file, giving the details such as the version of the software and the developer. The manifest file can also specify the main class of the application included in the JAR file. If so, then the JAR file can effectively be executed as an application, and many operating systems support double-clicking on a JAR file to run it as a Java program.
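Because a JAR file uses the ZIP format, any ZIP tool can look inside one. The sketch below uses Python’s standard zipfile module (Python is used here only to keep the examples in this chapter in one language) to build a tiny JAR-like archive in memory and read the Main-Class entry from its manifest. The archive contents, including the class name demo.Main, are made up for illustration, and the simple parser ignores details of the real manifest format such as wrapped long lines.

```python
import io
import zipfile


def build_demo_jar():
    """Create a tiny JAR-like archive in memory (contents are made up)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "META-INF/MANIFEST.MF",
            "Manifest-Version: 1.0\nMain-Class: demo.Main\n",
        )
        archive.writestr("demo/Main.class", b"placeholder bytes")
    buffer.seek(0)
    return buffer


def read_main_class(jar_file):
    """Return the Main-Class named in the manifest, or None if absent."""
    with zipfile.ZipFile(jar_file) as archive:
        manifest = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Main-Class":
            return value.strip()
    return None


print(read_main_class(build_demo_jar()))
```

The Main-Class entry is what allows a JAR file to be run directly as an application.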
There are a couple of ways we can install a JAR file into our application. In effect, we need to add it to our classpath, which is used by the Java compiler and Java runtime environment to locate the resources they need to operate.
When using Gradle, this process can be greatly simplified. In the build.gradle file, there are two important sections. First, there is a section for repositories that lists the repositories we can use for downloading and installing packages:
repositories {
    // Use Maven Central for resolving dependencies.
    mavenCentral()
}
As shown here, our project will use Maven Central to resolve and install any packages required.
Below that, there is a section for dependencies that lists the packages required by this project:
dependencies {
    // Use JUnit Jupiter API for testing.
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.6.2', 'org.hamcrest:hamcrest:2.2', 'org.junit.jupiter:junit-jupiter-params', 'org.mockito:mockito-inline:3.8.0', 'org.mockito:mockito-junit-jupiter:3.8.0'

    // Use JUnit Jupiter Engine for testing.

    // This dependency is used by the application.
    implementation 'com.google.guava:guava:29.0-jre', files('lib/restaurantregister-v0.1.0.jar')
}
The dependencies section contains three lists of packages:
testImplementation - packages used for compiling unit tests
testRuntimeOnly - packages used for running unit tests
implementation - packages required to compile and run the main source of the application
Typically, we’ll install most libraries by adding them to the implementation list. In this example, we can see that our application uses two libraries: guava, which is listed by its package name and version, and restaurantregister, which was manually downloaded as a JAR file that is now contained in the lib folder of our project’s directory. In this way, we can add any manually downloaded JAR files to our application by simply listing them there.
The Java programming language has many different libraries available for developers to use. Below is a list of some of the most common and useful libraries, as well as links for more information about each one. As we continue to develop more complex software, we may want to look at some of these libraries for additional information. We can also browse the repository at Maven Central for additional libraries we could use.
First and foremost, the Java Standard Library contains thousands of classes that we can use in our applications for a variety of purposes. So, before looking elsewhere, it is always worth checking to see if the Java Standard Library already includes what we need.
Beyond the Java Standard Library, there are two other general purpose libraries that are commonly used by Java developers:
Here are a few more libraries that are commonly used by Java developers, some of which we are already using in this course:
Python typically uses a special type of file called a Wheel to create a downloadable package that contains Python source code and any additional resources or bundled libraries required for the package. Wheel files replaced the older “egg” file format that Python used for distribution.
Most software libraries for Python are distributed as wheel files, including from the major repositories such as PyPI.
A wheel file itself is built using the same format as the ZIP file format. Typically, wheel files themselves are built by the setuptools library, which is not itself part of the core Python language but can be quickly installed as a package.
Finally, a wheel file can include additional information about the software, giving the details such as the version of the software and the developer.
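Since a wheel uses the ZIP format, we can peek at this information with Python’s standard zipfile module. The sketch below builds a tiny wheel-like archive in memory and reads the name and version from its METADATA file. The package name demo and its version are made up, and the archive is deliberately incomplete (a real wheel contains additional required files), so treat this as an illustration of the layout and not as an installable package.

```python
import io
import zipfile
from email.parser import Parser


def build_demo_wheel():
    """Create a tiny wheel-like archive in memory (not a complete wheel)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr(
            "demo-0.1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 0.1.0\n",
        )
    buffer.seek(0)
    return buffer


def read_wheel_metadata(wheel_file):
    """Return (name, version) from the archive's METADATA file, if present."""
    with zipfile.ZipFile(wheel_file) as archive:
        for name in archive.namelist():
            if name.endswith(".dist-info/METADATA"):
                text = archive.read(name).decode("utf-8")
                # METADATA uses email-style "Key: value" headers.
                fields = Parser().parsestr(text)
                return fields["Name"], fields["Version"]
    return None


print(read_wheel_metadata(build_demo_wheel()))
```

The same function should also work on a real wheel file downloaded by hand, since only the standard archive layout is assumed.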
Thankfully, installing a Python wheel file is very simple. Most recent versions of the pip3 tool will handle this automatically via one of two methods:
pip3 install <packagename> to find and download the package from PyPI. Most package entries on PyPI give the exact command needed to install them.
pip3 install <file>, where <file> is the path and name of a wheel file we downloaded manually.
In either case, pip3 will handle downloading, extracting and installing the Python wheel file on our system so it is ready for us to use in our Python applications.
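Both methods can also be driven from a script. The sketch below builds the equivalent command using python -m pip, which runs pip for the same interpreter that is running the script; the helper names and the wheel path shown are hypothetical. The install function is defined but not called here, since calling it would change the current environment.

```python
import subprocess
import sys


def pip_install_command(target):
    """Build the command for installing a package name or a wheel file path."""
    return [sys.executable, "-m", "pip", "install", target]


def install(target):
    """Run pip; raises CalledProcessError if the installation fails."""
    subprocess.run(pip_install_command(target), check=True)


# The same command shape works for a PyPI package name or a local wheel file.
print(pip_install_command("requests"))
print(pip_install_command("lib/demo-0.1.0-py3-none-any.whl"))  # made-up path
```

Using sys.executable helps make sure the package lands in the environment our program is actually running in, which matters when several Python installations or virtual environments exist on one system.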
As we learned in the “Hello Real World” project, we can also list these requirements in a requirements.txt file to have them automatically installed by pip3 when we use the tox command to automate checking and testing our application. In that case, we typically store any manually downloaded wheel files in a folder named lib inside of our package directory, and then we can add entries to tox.ini that look like lib/<filename>.whl to make sure the wheel file is installed properly in the virtual environment as well.
The Python programming language has many different libraries available for developers to use. Below is a list of some of the most common and useful libraries, as well as links for more information about each one. As we continue to develop more complex software, we may want to look at some of these libraries for additional information. We can also browse the repository at PyPI for additional libraries we could use.
First and foremost, the Python Standard Library contains hundreds of classes that we can use in our applications for a variety of purposes. So, before looking elsewhere, it is always worth checking to see if the Python Standard Library already includes what we need.
Here are a few more libraries that are commonly used by Python developers, some of which we are already using in this course:
In this chapter, we learned about software libraries and how we can use them in our applications. We explored the different types of libraries, including static libraries, shared libraries, and class libraries. We discussed the differences between libraries and frameworks and how to tell the difference. We also covered repositories and how to search for and download helpful software packages we can use.
In addition, we discussed the various licenses that may be attached to software libraries we use in our code, and how that may impact our ability to license and distribute our software in the future. We also looked at why it is worth the hassle of finding and downloading these libraries instead of writing the code ourselves.
Finally, we looked at the Java JAR and Python wheel file formats, and how to install those packages into our applications. We also listed some common software packages for both Java and Python that we may want to use ourselves.
In the example project for this chapter, we’ll look at how to download and install a custom package for our class project and how to integrate it into our code.