Chapter 5

UML

A unified way to model your software’s structure!

Introduction

Content Note

Much of the content in this chapter was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

As software systems became more complex, it became harder to talk and reason about them. Unified Modeling Language (UML) attempted to correct for this by providing a visual, diagrammatic approach to communicate the structure and function of a program. If a picture is worth a thousand words, a UML diagram might be worth a thousand lines of code.

Key Terms

Some key terms to learn in this chapter are:

  • Unified Modeling Language
  • Class Diagrams
  • Typed Elements
  • Constraints
  • Stereotypes
  • Attributes
  • Operations
  • Association
  • Generalization
  • Realization
  • Composition
  • Aggregation

Key Skills

The key skill to learn in this chapter is how to draw UML class diagrams for programs we are developing.

UML

UML Logo

Unified Modeling Language (UML) was introduced to create a standardized way of visualizing a software system design. It was developed by Grady Booch, Ivar Jacobson, and James Rumbaugh at Rational Software in the mid-1990s. It was adopted as a standard by the Object Management Group in 1997, and by the International Organization for Standardization (ISO) as an approved ISO standard in 2005.

The UML standard actually provides many different kinds of diagrams for describing a software system - both structure and behavior:

  • Class Diagram: A class diagram visualizes the structure of the classes in the software, and the relationships between these classes.
  • Component Diagram: A component diagram visualizes how the software system is broken into components, and how communication between those components is achieved.
  • Activity Diagram: An activity diagram represents workflows in a step-by-step process for actions. It is used to model data flow in a software system.
  • Use-Case Diagram: A use-case diagram identifies the kinds of users a software system will have, and how they work with the software.
  • Sequence Diagram: A sequence diagram shows object interactions arranged in chronological sequences.
  • Communication Diagram: A communication diagram models the interactions between objects in terms of sequences of messages.

The full UML specification is 754 pages long, so there is a lot of information packed into it. For the purposes of this class, we’re focusing on a single kind of diagram - the class diagram.

Boxes

UML class diagrams are largely composed of boxes - basically a rectangular border containing text. UML class diagrams use boxes to represent units of code - i.e. classes, structs, and enumerations. These boxes are broken into compartments. For example, an Enum is broken into two compartments:

A UML Enum representation

Stereotypes

UML is intended to be language-agnostic. But we often find ourselves in situations where we want to convey language-specific ideas, and the UML specification leaves room for this with stereotypes. Stereotypes consist of text enclosed in double less than and greater than symbols. In the example above, we indicate the box represents an enumeration with the <<enum>> stereotype. Another commonly used stereotype is the <<interface>> stereotype that is used with interfaces in Java.
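As a rough illustration, an enumeration box could correspond to a Python Enum. The FruitKind name, its members, and the uml_enum_box helper below are hypothetical, chosen only for this sketch. The first compartment of the box holds the stereotype and the name, and the second lists the members.

```python
from enum import Enum
from typing import Type


class FruitKind(Enum):
    """Hypothetical enumeration; its UML box would be labeled <<enum>>."""
    APPLE = 1
    BANANA = 2
    ORANGE = 3


def uml_enum_box(enum_cls: Type[Enum]) -> str:
    """Render the two compartments of an enum box as plain text."""
    header = f"<<enum>>\n{enum_cls.__name__}"
    # Iterating over an Enum yields its members in definition order
    members = "\n".join(member.name for member in enum_cls)
    return f"{header}\n---\n{members}"


print(uml_enum_box(FruitKind))
```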

Typed Elements

A second basic building block for UML diagrams is a typed element. Typed elements (as you might expect from the name) have a type. Fields are typed elements, as are method parameters and return values.

The pattern for defining a typed element is:

[visibility] element: type [constraint]

The optional [visibility] indicates the visibility of the element, element is its name, type is its type, and the optional [constraint] is a restriction placed on it.

Visibility

In UML, visibility (based on access modifiers in Java, or the use of underscores in Python) is indicated with symbols:

  • + indicates public access.
  • - indicates private access.
  • # indicates protected access, which we will discuss in a later chapter.

Consider, for example, a private size field. In a Java class, we would do the following:

Java
private int size;

In Python, we might have the following assignment in our constructor:

Python
self.__size: int = 0

In a UML diagram, that field would be expressed as:

- size: int
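The mapping from Python naming conventions to UML visibility symbols can be sketched as a small helper. The function names here are hypothetical, and treating a single leading underscore as protected is a common convention we are assuming, not something Python enforces.

```python
def uml_visibility(attribute_name: str) -> str:
    """Map a Python naming convention to a UML visibility symbol."""
    if attribute_name.startswith("__") and attribute_name.endswith("__"):
        return "+"  # dunder names such as __init__ are public
    if attribute_name.startswith("__"):
        return "-"  # double leading underscore: treated as private
    if attribute_name.startswith("_"):
        return "#"  # single leading underscore: treated as protected (assumed convention)
    return "+"      # no underscore: public


def uml_attribute(attribute_name: str, type_name: str) -> str:
    """Build a typed element such as '- size: int' from a Python attribute name."""
    symbol = uml_visibility(attribute_name)
    return f"{symbol} {attribute_name.lstrip('_')}: {type_name}"


print(uml_attribute("__size", "int"))  # - size: int
```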

Constraints

A typed element can include a constraint indicating some restriction for the element. The constraints are contained in a pair of curly braces after the typed element, and follow the pattern:

{element: boolean expression}

For example:

- age: int {age: >= 0}

indicates the private variable age must be greater than or equal to 0.
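UML only states the constraint; it is up to the code to enforce it. One way this might look in Python is a property whose setter checks the boolean expression. The Person class is a hypothetical example, not part of the diagrams in this chapter.

```python
class Person:
    """Hypothetical class for the UML element: - age: int {age: >= 0}"""

    def __init__(self, age: int = 0) -> None:
        self.age = age  # goes through the setter, so the constraint is checked

    @property
    def age(self) -> int:
        return self.__age

    @age.setter
    def age(self, value: int) -> None:
        # Enforce the UML constraint {age: >= 0}
        if value < 0:
            raise ValueError("age must be greater than or equal to 0")
        self.__age = value
```

With this in place, Person(3) is accepted, while assigning a negative age raises a ValueError.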

Classes

In a UML class diagram, individual classes are represented with a box divided into three compartments, each of which is for displaying specific information:

Class Diagram example

The first compartment identifies the class - it contains the name of the class. The second compartment holds the attributes of the class (the fields and properties). And the third compartment holds the operations of the class (the methods).

In the diagram above, we can see the Fruit class modeled on the right side.

Java vs. Python in UML

UML is a very flexible tool, but it can become difficult to create UML diagrams that accurately reflect the differences between programming languages. So, different developers might implement the same UML class diagram in slightly different ways.

For example, in Java we would use a boolean data type to represent a Boolean value, whereas Python uses the bool type. Likewise, Java also includes a class called Boolean that is an object wrapper around a primitive boolean variable, allowing it to be used in various Java collections. Additionally, some other languages do not include a Boolean data type at all, and instead use a small integer with 0 representing false and other values representing true.

In prior CC courses, it was important for the software to exactly match the specification so that our autograders would work. In that case, we provided UML diagrams that were somewhat unique to each programming language. For this course, we will create UML diagrams that are a bit more generalized.

In the descriptions below, we’ll include discussions of ways to properly represent each UML element for each language, but it may allow for some flexibility. In general, as long as a similarly experienced developer can follow the UML diagram and/or the source code and correlate the two, we will consider that good enough.

Attributes

The attributes in UML represent the state of an object. For most object-oriented languages, this would correspond to the fields and properties of the class.

We indicate fields in our UML diagram with a typed element. So, to create a private Boolean variable named blended, we would include one of the following (the first line uses the Java type name, the second the Python one):

- blended: boolean
- blended: bool

For Python, we may also choose to include the underscores in front of the name to show that it should be treated as a private attribute, as implied by the - at the start of the element:

- __blended: bool

However, this can make the UML a bit more difficult to read, so we generally won’t do this in the UML diagrams in this course.

Accessor Methods

Java and Python handle accessor methods differently, and they can be denoted in UML in many different ways.

A general solution would be to include a stereotype after the element, indicating if a public getter or setter should be created for that element. So, to create a getter and a setter for our blended attribute, we could do the following (the first line uses the Java type name, the second the Python one):

- blended: boolean <<get,set>>
- blended: bool <<get,set>>

Of course, each language would handle this a bit differently. In Java, we would create public getBlended() and setBlended(boolean) methods in our class. In Python, we would use the @property and @blended.setter decorators to create a Python property. While all of those are technically methods, they are really meant to implement the functionality of an attribute, so we’ll treat them as part of the attribute in UML.

What if our accessors implement unique functionality, or we want one of them to be protected instead of public? In those cases, we may want to include the explicit accessor methods as operations as described below. However, in general, it is best practice to make our UML as concise as possible, so we generally don’t list accessor methods directly unless there is a good reason to do so.
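For the Python side, the <<get,set>> stereotype on the blended attribute could be implemented with a property, roughly as follows. The Smoothie class name is a hypothetical stand-in for whichever class owns the attribute.

```python
class Smoothie:
    """Hypothetical class for the UML attribute: - blended: bool <<get,set>>"""

    def __init__(self) -> None:
        self.__blended: bool = False  # the private attribute from the UML element

    @property
    def blended(self) -> bool:
        """Public getter implied by <<get>>."""
        return self.__blended

    @blended.setter
    def blended(self, value: bool) -> None:
        """Public setter implied by <<set>>."""
        self.__blended = value
```

Callers then read and write smoothie.blended as if it were a plain attribute, which is why we treat the property as part of the attribute in UML.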

Operations

The operations in UML represent the behavior of the object, i.e. the methods we can invoke upon it. These are declared using the pattern:

[visibility] name([parameter list])[:return type]

The [visibility] portion uses the same symbols as typed elements, with the same correspondences. The name is the name of the method, and the [parameter list] is a comma-separated list of typed elements, corresponding to the parameters of the method. The [:return type] indicates the return type for the method. That portion can be omitted if the method doesn’t explicitly return a value (void in Java or None in Python).

Thus, in the example above, the protected method blend() has no parameters and returns a string.

Consider a method that adds together two integers and returns the result. The examples below show how the method’s signature corresponds to its UML element.

Java
public int add(int a, int b){
    return a + b;
}

Python
def add(a: int, b: int) -> int:
    return a + b

UML
+ add(a: int, b: int): int
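Because the UML operation pattern mirrors a method signature so closely, we can even derive one from the other. The sketch below uses Python's standard inspect module; the uml_operation helper is hypothetical, and it assumes every parameter is annotated with a plain class such as int or str.

```python
import inspect


def add(a: int, b: int) -> int:
    return a + b


def uml_operation(func, visibility: str = "+") -> str:
    """Render a function's signature using the UML operation pattern."""
    signature = inspect.signature(func)
    parameters = ", ".join(
        f"{name}: {parameter.annotation.__name__}"
        for name, parameter in signature.parameters.items()
        if name != "self"
    )
    line = f"{visibility} {func.__name__}({parameters})"
    # Omit the return type when it is None or was not annotated at all
    if signature.return_annotation not in (None, inspect.Signature.empty):
        line += f": {signature.return_annotation.__name__}"
    return line


print(uml_operation(add))  # + add(a: int, b: int): int
```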

Static and Abstract

In UML, we indicate a class is static by underlining its name in the first compartment of the class diagram. We can similarly indicate attributes and operations are static by underlining the entire line referring to them.

To indicate a class is abstract, we italicize its name. Abstract methods are also indicated by italicizing the entire line referring to them.

We’ll talk more about some of these concepts in a later chapter.
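As a brief preview, here is how abstract and static members might look in Python. Shape and Square are hypothetical classes used only for this sketch; the comments note how each line would be drawn in UML.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: its name would be italicized in the UML box."""

    @abstractmethod
    def area(self) -> float:
        """Abstract operation: this line would be italicized in UML."""

    @staticmethod
    def unit_name() -> str:
        """Static operation: this line would be underlined in UML."""
        return "square units"


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.__side = side

    def area(self) -> float:
        return self.__side ** 2
```

Trying to create a Shape directly raises a TypeError, while Shape.unit_name() can be called without any instance at all.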

Associations

Class diagrams also express the associations between classes by drawing lines between the boxes representing them.

UML Association

There are two basic types of associations we model with UML: has-a and is-a associations. We break these into two further categories, based on the strength of the association, which is either strong or weak. These associations are:

Association Name    Association Type    Typical Usage
Realization         weak is-a           Interfaces
Generalization      strong is-a         Inheritance
Aggregation         weak has-a          Collections
Composition         strong has-a        Encapsulation

Is-A Associations

Is-a associations indicate a relationship where one class is a kind of another class. Thus, these associations represent polymorphism, where an object can be treated as an instance of another class, i.e. it has both its own type and the associated class’s type.

Realization (Weak is-a)

Realization refers to making an interface “real” by implementing the methods it defines. An interface is a special type of abstract class that only includes abstract methods. In effect, it is creating a defined list of operations, or an interface (or API), that subclasses must include so that they can all be used in the same way. For Java, this corresponds to a class that implements an interface. The Python language doesn’t have interfaces, but we’ll learn how to create something similar using abstract classes. We call this an is-a relationship, because the class can be treated as being the same data type as the interface class. It is also a weak relationship as the same interface can be implemented by otherwise unrelated classes. In UML, realization is indicated by a dashed arrow in the direction of implementation:

Realization in UML

Generalization (Strong is-a)

Generalization refers to extracting the shared parts from different classes to make a general base class of what they have in common. For Java and Python, this corresponds to inheritance. We call this a strong is-a relationship, because the class has all the same state and behavior as the base class. In UML, generalization is indicated by a solid arrow in the direction of inheritance:

Generalization in UML

Also notice that we show that Fruit and its blend() method are abstract by italicizing them. The association tells us that the Banana class is a Fruit.
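In Python, the generalization between Fruit and Banana might be sketched as follows. Only the class names and the abstract blend() method come from the diagram; the name attribute, the describe() method, and the public visibility of blend() are assumptions added to show inherited state and behavior.

```python
from abc import ABC, abstractmethod


class Fruit(ABC):
    """Abstract base class holding what all fruits share."""

    def __init__(self, name: str) -> None:
        self._name = name  # shared state, inherited by every subclass

    def describe(self) -> str:
        # Shared behavior, inherited unchanged by subclasses
        return f"{self._name}: {self.blend()}"

    @abstractmethod
    def blend(self) -> str:
        """Abstract in the diagram, so each subclass must provide it."""


class Banana(Fruit):
    def __init__(self) -> None:
        super().__init__("banana")

    def blend(self) -> str:
        return "a creamy yellow mush"


banana = Banana()
print(isinstance(banana, Fruit))  # True, because a Banana is a Fruit
print(banana.describe())
```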

Has-A Associations

Has-a associations indicate that a class holds one or more references to instances of another class. In Java or Python, this corresponds to having a variable or collection with the type of the associated class. This is true for both kinds of has-a associations. The difference between the two is how strong the association is.

Aggregation (Weak has-a)

Aggregation refers to collecting references to other classes. As the aggregating class has references to the other classes, we call this a has-a relationship. It is considered weak because the aggregated classes are only collected by the aggregating class, and can exist on their own. It is indicated in UML by a solid line from the aggregating class to the one it aggregates, with an open diamond “fletching” on the opposite side of the arrow (the arrowhead is optional).

Aggregation in UML

Composition (Strong has-a)

Composition refers to assembling a class from other classes, “composing” it. As the composed class has references to the other classes, we call this a has-a relationship. However, the composing class typically creates the instances of the classes composing it, and they are likewise destroyed when the composing class is destroyed. For this reason, we call it a strong relationship. It is indicated in UML by a solid line from the composing class to those it is composed of, with a solid diamond “fletching” on the opposite side of the arrow (the arrowhead is optional).

Composition in UML

Aggregation vs. Composition

Aggregation and composition are commonly confused, especially given they both are defined by holding a variable or collection of another class type. Here’s a helpful analogy to explain the difference, based on the diagrams listed above:

Aggregation is like a shopping cart. When you go shopping, you place groceries into the shopping cart, and it holds them as you push it around the store. Thus, a ShoppingCart class might have a List<Grocery> named items, and you would add the items to it. When you reach the checkout, you would then take the items back out. The individual Grocery objects existed before they were aggregated by the ShoppingCart, and also after they were removed from it. The ShoppingCart class just keeps track of them.

In contrast, composition is like an organism. Say we create a class representing a Dog. It might be composed of classes like Tongue, Ear, Leg, and Tail. We would probably construct these parts in the Dog class’s constructor, and when we dispose of the Dog object, we wouldn’t expect these component classes to stick around. So, they are inherently a part of the encapsulating class.

Additionally, sometimes the attributes containing these external items may be omitted from the UML diagram of the composing or aggregating class. This is mainly because the existence of those attributes can be inferred by the relationships themselves. However, in this course, we will include the relevant attributes in the encapsulating class, as well as the association arrows, in our UML diagrams.
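The shopping cart and dog analogies can be sketched in Python to show where the difference appears in code. The method names (add, remove, wag, greet) are hypothetical, and the Dog is reduced to a single Tail to keep the sketch short.

```python
from typing import List


class Grocery:
    def __init__(self, name: str) -> None:
        self.name = name


class ShoppingCart:
    """Aggregation: the cart only keeps references to existing groceries."""

    def __init__(self) -> None:
        self.items: List[Grocery] = []

    def add(self, item: Grocery) -> None:
        self.items.append(item)

    def remove(self, item: Grocery) -> None:
        self.items.remove(item)


class Tail:
    def wag(self) -> str:
        return "wag"


class Dog:
    """Composition: the dog creates and owns its parts."""

    def __init__(self) -> None:
        self.__tail = Tail()  # created here, never handed in from outside

    def greet(self) -> str:
        return self.__tail.wag()


milk = Grocery("milk")  # exists before it is placed in any cart
cart = ShoppingCart()
cart.add(milk)
cart.remove(milk)
print(milk.name)        # the grocery is still usable after leaving the cart
print(Dog().greet())
```

The tell-tale sign is who creates the object: the Grocery is made outside the cart and passed in, while the Tail is made inside the Dog constructor and is never shared.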

Multiplicity

With aggregation and composition, we may also place numbers on either end of the association, indicating the number of objects involved. We call these numbers the multiplicity of the association.

Composition in UML

For example, the Frog class in the composition example has two front legs and two rear legs, so we indicate that each Frog instance (by a 1 on the Frog side of the association) has exactly two (by the 2 on the leg side of the association) of each kind of leg. The tongue has a 1 to 1 multiplicity as each frog has one tongue.

Aggregation in UML

Multiplicities can also be represented as a range (indicated by the start and end of the range separated by ..). We see this in the ShoppingCart example above, where the count of GroceryItems in the cart ranges from 0 to infinity (infinity is indicated by an asterisk *).

Generalization and realization are always one-to-one multiplicities, so multiplicities are typically omitted for these associations.
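Multiplicities often hint at the kind of variable we should use in code. As a sketch, a fixed multiplicity like 2 might map to a fixed-size tuple, while 0..* maps naturally to a list. The Leg and Tongue class names and all attribute names below are assumptions, since they depend on how the diagram labels them.

```python
from typing import List, Tuple


class Leg:
    pass


class Tongue:
    pass


class Frog:
    """Each Frog (1) is composed of 2 front legs, 2 rear legs and 1 tongue."""

    def __init__(self) -> None:
        self.front_legs: Tuple[Leg, Leg] = (Leg(), Leg())  # multiplicity 2
        self.rear_legs: Tuple[Leg, Leg] = (Leg(), Leg())   # multiplicity 2
        self.tongue: Tongue = Tongue()                     # multiplicity 1


class GroceryItem:
    pass


class ShoppingCart:
    """A cart aggregates 0..* grocery items, so a list of any length fits."""

    def __init__(self) -> None:
        self.items: List[GroceryItem] = []                 # multiplicity 0..*
```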

Creating UML Diagrams

There are many tools available to help you develop your own UML diagrams. Here are a few that we recommend using for this course.

Diagrams.net

Diagrams.net Interface

Most of the graphics used in the Computational Core program, including the UML diagrams in this and previous courses, are made using the free Diagrams.net tool.

When creating a new diagram, you can select the UML Diagram template to get started. The interface is really simple and easy to use, with lots of drag-and-drop components you can add to your diagram.

To create multiplicities, you can simply add text boxes to your arrows.

To export a diagram, click the File menu and choose the Export To option. You can create both PNG and SVG files!

Diagrams in Image Files

One great feature of Diagrams.net is the ability to embed the diagram data directly into an image file exported from the application. In that way, we only have to have access to the image in order to open the diagram and update the image.

Try it yourself! Right-click on a UML diagram in this book to download it as an image, and then open the image using the upload option in Diagrams.net. You should be able to edit the diagram!

Visio

Another tool we can use to create UML diagrams is Microsoft Visio. For Kansas State University Computer Science students, this can be downloaded through your Azure Student Portal.

Visio is a vector graphics editor for creating flowcharts and diagrams. It comes preloaded with a UML class diagram template, which can be selected when creating a new file:

Visio Template

Class diagrams are built by dragging shapes from the shape toolbox onto the drawing surface. Notice that the shapes include classes, interfaces, enumerations, and all the associations we have discussed. Once in the drawing surface, these can be resized and edited.

Right-clicking on an association will open a context menu, allowing you to turn on multiplicities. These can be edited by double-clicking on them. Unneeded multiplicities can be deleted.

To export a Visio project as a PDF or in another format, choose the “Export” option from the File menu.

UML Example

Let’s work through an example of creating a UML class diagram based on existing code. This is loosely based on a project from an earlier course, so some of the structure may be familiar.

The Project

This project is a number calculator that makes use of object-oriented concepts such as inheritance, interfaces, and polymorphism to represent different types of numbers using different classes. We’ll also follow the Model-View-Controller (MVC) architectural pattern.

Number Interface

We’ll start by looking at the Number interface, which is the basis of all of the number classes. We’re omitting the method code in these examples, since we are only concerned with the overall structure of the classes themselves.

Java
public interface Number {
    Number add(Number n);
    Number subtract(Number n);
    Number multiply(Number n);
    Number divide(Number n);
}

Python
from __future__ import annotations  # lets methods refer to Number inside its own definition

import abc


class Number(metaclass=abc.ABCMeta):

    @classmethod
    def __subclasshook__(cls, subclass: type) -> bool: ...  # body omitted

    @abc.abstractmethod
    def add(self, n: Number) -> Number: ...

    @abc.abstractmethod
    def subtract(self, n: Number) -> Number: ...

    @abc.abstractmethod
    def multiply(self, n: Number) -> Number: ...

    @abc.abstractmethod
    def divide(self, n: Number) -> Number: ...

In UML, we’d represent this interface using the following box. It includes the <<interface>> stereotype, as well as the listed methods shown in italics since they are all abstract. Finally, each method in an interface is assumed to be public, so we’ll include a plus symbol + in front of each method.

Number Interface

Real Number Class

Next is the class for representing real numbers. This class will be a realization of the Number interface, as we can see in the code:

Java
public class RealNumber implements Number {

    private double value;

    public RealNumber(double value){ }

    public Number add(Number n){ }

    public Number subtract(Number n){ }

    public Number multiply(Number n){ }

    public Number divide(Number n){ }

    @Override
    public String toString(){ }

    @Override
    public boolean equals(Object o){ }
}

Python
class RealNumber(Number):

    def __init__(self, value: float) -> None:
        self.__value = value

    def add(self, n: Number) -> Number: ...  # bodies omitted

    def subtract(self, n: Number) -> Number: ...

    def multiply(self, n: Number) -> Number: ...

    def divide(self, n: Number) -> Number: ...

    def __str__(self) -> str: ...

    def __eq__(self, o: object) -> bool: ...

The class also includes implementations for a couple of other methods beyond the interface, including a constructor. So, in our UML diagram, we’ll add another box to represent that class, and use the realization association arrow to show the connection between the classes. Remember that the arrow itself points toward the interface or parent class.

RealNumber Class
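To make the realization concrete, here is a minimal sketch of how a few of the omitted method bodies might be filled in. The bodies are assumptions made for illustration, not the project's actual code. In particular, this add() only accepts another RealNumber, and the Number class is cut down to a single abstract method to keep the sketch short.

```python
import abc


class Number(abc.ABC):
    """Cut-down version of the interface with one abstract method."""

    @abc.abstractmethod
    def add(self, n: "Number") -> "Number": ...


class RealNumber(Number):
    def __init__(self, value: float) -> None:
        self.__value = value

    def add(self, n: Number) -> Number:
        # Assumption: this sketch only supports adding another RealNumber
        if not isinstance(n, RealNumber):
            raise TypeError("this sketch only adds RealNumber instances")
        return RealNumber(self.__value + n.__value)

    def __eq__(self, o: object) -> bool:
        return isinstance(o, RealNumber) and self.__value == o.__value

    def __str__(self) -> str:
        return str(self.__value)


total = RealNumber(1.5).add(RealNumber(2.0))
print(total)  # 3.5
```

Note that add() is declared to return a Number, not a RealNumber, so code that calls it only depends on the interface at the head of the realization arrow.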

Other Number Classes

From here, it’s pretty easy to see how we can use inheritance to create a RationalNumber class and an IntegerNumber class. The only way they differ from the RealNumber class is in their attributes. So, we’ll quickly add those to our UML diagram as well.

All Number Classes

Complex Numbers

At this point, we can add a new class to represent complex numbers. A complex number consists of two parts - a real part and an imaginary part. So, it will implement the Number interface, and it will also be composed of two RealNumber attributes. Notice that we’re using RealNumber as the attribute type instead of the Number interface. This is because we don’t want a complex number to contain a complex number, so we’re being careful about which type we use. In code, this class would look like this:

Java
public class ComplexNumber implements Number {

    private RealNumber real;
    private RealNumber imaginary;

    public ComplexNumber(RealNumber real, RealNumber imaginary){ }

    public Number add(Number n){ }

    public Number subtract(Number n){ }

    public Number multiply(Number n){ }

    public Number divide(Number n){ }

    @Override
    public String toString(){ }

    @Override
    public boolean equals(Object o){ }
}

Python
class ComplexNumber(Number):

    def __init__(self, real: RealNumber, imaginary: RealNumber) -> None:
        self.__real = real
        self.__imaginary = imaginary

    def add(self, n: Number) -> Number: ...  # bodies omitted

    def subtract(self, n: Number) -> Number: ...

    def multiply(self, n: Number) -> Number: ...

    def divide(self, n: Number) -> Number: ...

    def __str__(self) -> str: ...

    def __eq__(self, o: object) -> bool: ...

In our UML diagram, we’ll add a box for this class. We’ll also add a realization association to the Number interface, as well as a composition association to the RealNumber class, complete with the multiplicity of the relationship.

Imaginary Numbers

MVC Components

Once we’ve created all of our number classes, we can quickly create our View and Controller classes as well. They will handle getting input from the user, performing operations, and displaying the results.

Java
import java.util.List;

public class View {

    public View(){ }

    public void show(Number n){ }

    public String input(){ }

}

public class Controller {

    private List<Number> numbers;
    private View view;

    public Controller(){ }

    public void build(){ }

    public void sum(){ }

    public static void main(String[] args){ }
}

Python
from typing import List


class View:

    def __init__(self) -> None: ...  # bodies omitted

    def show(self, n: Number) -> None: ...

    def input(self) -> str: ...


class Controller:

    def __init__(self) -> None:
        self.__numbers: List[Number] = list()
        self.__view: View = View()

    def build(self) -> None: ...

    def sum(self) -> None: ...

    @staticmethod
    def main(args: List[str]) -> None: ...

In the code, we see that the Controller class contains an attribute for a single View instance, and also a list of Number instances. So, we’ll end up using a composition association between Controller and View, and an aggregation association between Controller and the Number interface.

Full UML

This is a small example, but it demonstrates many of the important object-oriented concepts in a single UML diagram:

  • Number is an interface, represented in Python as an abstract class.
  • RealNumber implements the Number interface through a realization association.
  • RationalNumber and IntegerNumber show direct inheritance through a generalization association.
  • ComplexNumber contains two RealNumber instances, showing the composition association and a multiplicity of 2.
  • The Controller, View and Number classes make up the various parts of an MVC architecture.
  • The Controller stores a list of Number instances, demonstrating the aggregation association.
  • The Controller also contains a single View instance, which is another composition association with a multiplicity of 1.

Further Reading

UML is a very broad topic to cover in a single module, let alone a single class. For more information on building and reading UML diagrams, refer to these sources:

There are also many textbooks devoted to teaching UML concepts, as well as lots of examples online to learn from. The O’Reilly subscription through the K-State Libraries offers several books to choose from that can be accessed for free through this link:

Summary

In this section, we learned about UML class diagrams, a language-agnostic approach to visualizing the structure of an object-oriented software system. We saw how individual classes are represented by boxes divided into three compartments: the first for the identity of the class, the second for its attributes, and the third for its operations. We learned that italics are used to indicate abstract classes and operations, and that underlining is used to indicate static classes, attributes, and operations.

We also saw how associations between classes can be represented by arrows with specific characteristics, and examined four of these in detail: aggregation, composition, generalization, and realization. We also learned how multiplicities can show the number of instances involved in these associations.

Finally, we saw how classes, interfaces, and enumerations are modeled using UML. We saw how the stereotype can be used to indicate language-specific features like properties. We also looked at creating UML class diagrams using Diagrams.net and Microsoft Visio.
