Kubernetes Blog

Wednesday, September 18, 2019

Kubernetes 1.16: Custom Resources, Overhauled Metrics, and Volume Extensions

Authors: Kubernetes 1.16 Release Team

We’re pleased to announce the delivery of Kubernetes 1.16, our third release of 2019! Kubernetes 1.16 consists of 31 enhancements: 8 enhancements moving to stable, 8 enhancements in beta, and 15 enhancements in alpha.

Major Themes

Custom resources

CRDs are in widespread use as a Kubernetes extensibility mechanism and have been available in beta since the 1.7 release. The 1.16 release marks the graduation of CRDs to general availability (GA).

Overhauled metrics

Kubernetes has previously made extensive use of a global metrics registry to register the metrics it exposes. With the new metrics registry implementation, metrics are registered through a more transparent mechanism. Previously, Kubernetes metrics had been excluded from any kind of stability requirements.

Volume Extension

There are quite a few enhancements in this release that pertain to volumes and volume modifications. Volume resizing support in the CSI specs is moving to beta, which allows any CSI volume plugin to be resizable.
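
As a rough illustration (not taken from the release notes), resizing a CSI-backed volume involves a StorageClass that allows expansion and then raising the claim's storage request; the driver name below is a placeholder, not a real CSI driver:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: resizable-csi
provisioner: csi.example.com        # placeholder CSI driver name
allowVolumeExpansion: true          # required for resize requests to be permitted
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: resizable-csi
  resources:
    requests:
      storage: 20Gi                 # raised from a smaller original request to trigger a resize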

Additional Enhancements

Custom Resources Reach General Availability

CRDs have become the basis for extensions in the Kubernetes ecosystem. Started as a ground-up redesign of the ThirdPartyResources prototype, they have finally reached GA in 1.16 with apiextensions.k8s.io/v1, as the hard-won lessons of API evolution in Kubernetes have been integrated. As we transition to GA, the focus is on data consistency for API clients.

As you upgrade to the GA API, you’ll notice that several of the previously optional guard rails have become required and/or default behavior. Things like structural schemas, pruning unknown fields, validation, and protecting the *.k8s.io group are important for ensuring the longevity of your APIs and are now much harder to accidentally miss. Defaulting is another important part of API evolution and that support will be on by default for CRD.v1. The combination of these, along with CRD conversion mechanisms are enough to build stable APIs that evolve over time, the same way that native Kubernetes resources have changed without breaking backward-compatibility.
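
As a minimal sketch of what this looks like in practice (the example.com group and CronTab kind are illustrative, not part of the release), a v1 CRD must carry a structural schema and can declare defaults directly in it:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:          # a structural schema is required in apiextensions.k8s.io/v1
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
                replicas:
                  type: integer
                  default: 1      # defaulting is on by default for CRD.v1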

Updates to the CRD API won’t end here. We have ideas for features like arbitrary subresources, API group migration, and maybe a more efficient serialization protocol, but the changes from here are expected to be optional and complementary in nature to what’s already here in the GA API. Happy operator writing!

Details on how to work with custom resources can be found in the Kubernetes documentation.

Opening Doors With Windows Enhancements

Beta: Enhancing the workload identity options for Windows containers

Active Directory Group Managed Service Account (GMSA) support is graduating to beta and certain annotations that were introduced with the alpha support are being deprecated. GMSA is a specific type of Active Directory account that enables Windows containers to carry an identity across the network and communicate with other resources. Windows containers can now gain authenticated access to external resources. In addition, GMSA provides automatic password management, simplified service principal name (SPN) management, and the ability to delegate the management to other administrators across multiple servers.
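
As a rough, hedged sketch of how a workload picks up a GMSA identity (the webapp-gmsa credential spec is assumed to have been created separately by a cluster administrator):

apiVersion: v1
kind: Pod
metadata:
  name: iis-gmsa
spec:
  securityContext:
    windowsOptions:
      gmsaCredentialSpecName: webapp-gmsa   # references a GMSA credential spec installed in the cluster
  containers:
    - name: iis
      image: mcr.microsoft.com/windows/servercore/iis
  nodeSelector:
    kubernetes.io/os: windows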

Support for RunAsUserName is being added as an alpha feature. RunAsUserName is a string specifying the Windows identity (username) to use when running the entrypoint of the container, and it is part of the newly introduced windowsOptions component of the securityContext (WindowsSecurityContextOptions).
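
A minimal sketch, assuming the alpha feature gate for this field is enabled on your cluster; the usernames shown are the built-in Windows container accounts:

apiVersion: v1
kind: Pod
metadata:
  name: run-as-username-demo
spec:
  securityContext:
    windowsOptions:
      runAsUserName: "ContainerUser"              # pod-level default identity
  containers:
    - name: app
      image: mcr.microsoft.com/windows/servercore:ltsc2019
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerAdministrator" # container-level override
  nodeSelector:
    kubernetes.io/os: windows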

Alpha: Improvements to setup & node join experience with kubeadm

Introducing alpha support for kubeadm, enabling Kubernetes users to easily join (and reset) Windows worker nodes to an existing cluster the same way they do for Linux nodes. Users can utilize kubeadm to prepare and add a Windows node to a cluster. When the operations are complete, the node will be in a Ready state and able to run Windows containers. In addition, we will also provide a set of Windows-specific scripts to enable the installation of prerequisites and CNIs ahead of joining the node to the cluster.

Alpha: Introducing support for Container Storage Interface (CSI)

Introducing CSI plugin support for out-of-tree providers, enabling Windows nodes in a Kubernetes cluster to leverage persistent storage capabilities for Windows-based workloads. This significantly expands the storage options of Windows workloads, adding onto a list that included FlexVolume and in-tree storage plugins. This capability is achieved through a host OS proxy that enables the execution of privileged operations on the Windows node on behalf of containers.

Introducing Endpoint Slices

The release of Kubernetes 1.16 includes an exciting new alpha feature: Endpoint Slices. These provide a scalable and extensible alternative to Endpoints resources. Behind the scenes, these resources play a big role in network routing within Kubernetes. Each network endpoint is tracked within these resources, and kube-proxy uses them for generating proxy rules that allow pods to communicate with each other so easily in Kubernetes.

Providing Greater Scalability

A key goal for Endpoint Slices is to enable greater scalability for Kubernetes Services. With the existing Endpoints resources, a single resource must include network endpoints representing all pods matching a Service. As Services start to scale to thousands of pods, the corresponding Endpoints resources become quite large. Simply adding or removing one endpoint from a Service at this scale can be quite costly. As the Endpoints resource is updated, every piece of code watching Endpoints will need to be sent a full copy of the resource. With kube-proxy running on every node in a cluster, a copy needs to be sent to every single node. At a small scale, this is not an issue, but it becomes increasingly noticeable as clusters get larger.

As a simple example, in a cluster with 5,000 nodes and a 1MB Endpoints object, any update would result in approximately 5GB transmitted (that’s enough to fill a DVD). This becomes increasingly significant given how frequently Endpoints can change during events like rolling updates on Deployments.

With Endpoint Slices, network endpoints for a Service are split into multiple resources, significantly decreasing the amount of data required for updates at scale. By default, Endpoint Slices are limited to 100 endpoints each.

For example, let’s take a cluster with 20,000 network endpoints spread over 5,000 nodes. Updating a single endpoint will be much more efficient with Endpoint Slices since each one includes only a tiny portion of the total number of network endpoints. Instead of transferring a big Endpoints object to each node, only the small Endpoint Slice that’s been changed has to be transferred. The net effect is that approximately 200x less data needs to be transferred for this update.

                                 Endpoints                Endpoint Slices
# of resources                   1                        20k / 100 = 200
# of network endpoints stored    1 * 20k = 20k            200 * 100 = 20k
size of each resource            20k * const = ~2.0 MB    100 * const = ~10 kB
watch event data transferred     ~2.0 MB * 5k = 10 GB     ~10 kB * 5k = 50 MB

The second primary goal for Endpoint Slices was to provide a resource that would be highly extensible and useful across a wide variety of use cases. One of the key additions with Endpoint Slices involves a new topology attribute. By default, this will be populated with the existing topology labels used throughout Kubernetes indicating attributes such as region and zone. Of course, this field can be populated with custom labels as well for more specialized use cases.

Endpoint Slices also include greater flexibility for address types. Each contains a list of addresses. An initial use case for multiple addresses would be to support dual stack endpoints with both IPv4 and IPv6 addresses.
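
To make the shape of the resource concrete, here is a rough sketch of an alpha EndpointSlice; the exact API version and topology keys shown reflect the 1.16 alpha as an assumption and may differ in later releases:

apiVersion: discovery.k8s.io/v1alpha1
kind: EndpointSlice
metadata:
  name: example-abc
  labels:
    kubernetes.io/service-name: example        # ties the slice back to its Service
addressType: IP
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "10.1.2.3"
    conditions:
      ready: true
    topology:
      kubernetes.io/hostname: node-1           # topology entries are ordinary labels
      topology.kubernetes.io/zone: us-west2-a  # zone/region keys may vary by release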

The Kubernetes documentation has a lot more information about Endpoint Slices. There’s also a great KubeCon talk that provides more information on the initial rationale for developing Endpoint Slices. As an alpha feature in Kubernetes 1.16, they will not be enabled by default, but the docs cover how to enable them in your cluster.

Notable Feature Updates

  • Topology Manager, a new Kubelet component, aims to coordinate resource assignment decisions to provide optimized resource allocations.
  • IPv4/IPv6 dual-stack enables the allocation of both IPv4 and IPv6 addresses to Pods and Services.
  • Extensions for Cloud Controller Manager Migration.
  • Continued deprecation of the extensions/v1beta1, apps/v1beta1, and apps/v1beta2 APIs. These API versions are retired in 1.16! A minimal migration sketch follows this list.
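
A minimal migration sketch for the last bullet above, assuming a typical Deployment manifest; the main changes are the apiVersion and the now-mandatory selector:

apiVersion: apps/v1          # previously extensions/v1beta1, apps/v1beta1, or apps/v1beta2
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:                  # required in apps/v1
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.17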

Availability

Kubernetes 1.16 is available for download on GitHub. To get started with Kubernetes, check out these interactive tutorials. You can also easily install 1.16 using kubeadm.

Release Team

This release is made possible through the efforts of hundreds of individuals who contributed both technical and non-technical content. Special thanks to the release team led by Lachlan Evenson, Principal Program Manager at Microsoft. The 32 individuals on the release team coordinated many aspects of the release, from documentation to testing, validation, and feature completeness.

As the Kubernetes community has grown, our release process represents an amazing demonstration of collaboration in open source software development. Kubernetes continues to gain new users at a rapid pace. This growth creates a positive feedback cycle where more contributors commit code creating a more vibrant ecosystem. Kubernetes has had over 32,000 individual contributors to date and an active community of more than 66,000 people.

Release Mascot

The Kubernetes 1.16 release crest was loosely inspired by the Apollo 16 mission crest. It represents the hard work of the release-team and the community alike and is an ode to the challenges and fun times we shared as a team throughout the release cycle. Many thanks to Ronan Flynn-Curran of Microsoft for creating this magnificent piece.

Kubernetes 1.16 Release Mascot

Kubernetes Updates

Project Velocity

The CNCF has continued refining DevStats, an ambitious project to visualize the myriad contributions that go into the project. K8s DevStats illustrates the breakdown of contributions from major company contributors, as well as an impressive set of preconfigured reports on everything from individual contributors to pull request lifecycle times. This past year, 1,147 different companies and over 3,149 individuals have contributed to Kubernetes each month. Check out DevStats to learn more about the overall velocity of the Kubernetes project and community.

Ecosystem

  • The Kubernetes project leadership created the Security Audit Working Group to oversee the very first third-party Kubernetes security audit, in an effort to improve the overall security of the ecosystem.
  • The Kubernetes Certified Service Providers program (KCSP) reached 100 member companies, ranging from the largest multinational cloud, enterprise software, and consulting companies to tiny startups.
  • The first Kubernetes Project Journey Report was released, showcasing the massive growth of the project.

KubeCon + CloudNativeCon

The Cloud Native Computing Foundation’s flagship conference gathers adopters and technologists from leading open source and cloud native communities in San Diego, California from November 18-21, 2019. Join Kubernetes, Prometheus, Envoy, CoreDNS, containerd, Fluentd, OpenTracing, gRPC, CNI, Jaeger, Notary, TUF, Vitess, NATS, Linkerd, Helm, Rook, Harbor, etcd, Open Policy Agent, CRI-O, and TiKV as the community gathers for four days to further the education and advancement of cloud native computing. Register today!

Webinar

Join members of the Kubernetes 1.16 release team on Oct 22, 2019 to learn about the major features in this release. Register here.

Get Involved

The simplest way to get involved with Kubernetes is by joining one of the many Special Interest Groups (SIGs) that align with your interests. Have something you’d like to broadcast to the Kubernetes community? Share your voice at our weekly community meeting, and through the channels below. Thank you for your continued feedback and support.



Monday, April 18, 2016

SIG-Networking: Kubernetes Network Policy APIs Coming in 1.3

Editor's note: This week's theme is the Kubernetes Special Interest Groups; today's post comes from the Network SIG and covers the Network Policy API coming in the 1.3 release: policies for security, isolation, and multi-tenancy.

The Kubernetes Network SIG has been meeting regularly since late last year to work on bringing network policy to Kubernetes, and we are now starting to see the results of that work.

One problem many users run into is that Kubernetes' open-access network policy is not well suited to applications that need more precise control over which pods or services can be reached. Today, that might be a multi-tier application where only adjacent tiers are allowed to communicate. But as cloud-native applications are increasingly built by composing microservices, the ability to control how traffic flows among those services becomes ever more important.

In most IaaS environments (public or private), this kind of network control is achieved by combining VMs with "security groups", where communication among members of a group is defined by a network policy or Access Control List (ACL) and enforced by a network packet filter.

The Network SIG's first task was to identify specific use cases that need basic network isolation for improved security. Getting these APIs right for simple, common use cases is especially important, because they will form the foundation for the more sophisticated network policies needed for multi-tenancy within Kubernetes.

Based on these use cases, we considered several different approaches and then defined a minimal policy specification. The basic idea is that if isolation is enabled on a per-namespace basis, specific pods are then selected according to the kinds of traffic that are allowed to reach them.

The quickest way to support this experimental API is to add a ThirdPartyResource extension to the API server, which is already possible in Kubernetes 1.2.

If you're not familiar with the details: the Kubernetes API can be extended by defining ThirdPartyResources, which create new API endpoints at specified URLs.

third-party-res-def.yaml

kind: ThirdPartyResource
apiVersion: extensions/v1beta1
metadata:
  name: network-policy.net.alpha.kubernetes.io
description: "Network policy specification"
versions:
  - name: v1alpha1

$ kubectl create -f third-party-res-def.yaml

This command creates an API endpoint (one per namespace):

/apis/net.alpha.kubernetes.io/v1alpha1/namespaces/default/networkpolicys/

Third-party network controllers can watch these endpoints and react as needed when resources are created, modified, or deleted. Note: in the upcoming Kubernetes 1.3 release, the Network Policy API will appear as a beta API, which removes the need to create a ThirdPartyResource API endpoint as shown above.

Network isolation is off by default, so all pods can communicate freely with each other. However, it's important to know that once network isolation is enabled, communication between all pods in all namespaces is blocked; in other words, enabling isolation changes pod behavior.

Network isolation is turned on or off with the network-isolation annotation in net.alpha.kubernetes.io, defined on the namespace:

net.alpha.kubernetes.io/network-isolation: [on | off]
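
For illustration, turning isolation on for a namespace with this experimental annotation might look like the following (a sketch of the alpha mechanism described above, not the 1.3 beta API):

apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  annotations:
    net.alpha.kubernetes.io/network-isolation: "on"   # blocks pod traffic until explicit policies allow it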

Once network isolation is enabled, explicit network policies are required to allow pod communication.

A policy specification can be applied to a namespace to define the details of the policy, as shown below:

POST /apis/net.alpha.kubernetes.io/v1alpha1/namespaces/tenant-a/networkpolicys/
{
  "kind": "NetworkPolicy",
  "metadata": {
    "name": "pol1"
  },
  "spec": {
    "allowIncoming": {
      "from": [
        {
          "pods": {
            "segment": "frontend"
          }
        }
      ],
      "toPorts": [
        {
          "port": 80,
          "protocol": "TCP"
        }
      ]
    },
    "podSelector": {
      "segment": "backend"
    }
  }
}

In this example, the tenant-a namespace gets the pol1 policy applied. Specifically, pods carrying the segment label backend will accept TCP traffic on port 80 from pods carrying the segment label frontend.

Today, Romana, OpenShift, OpenContrail, and Calico all support network policies applied to namespaces and pods, and Cisco and VMware are working on implementations as well. Romana and Calico demonstrated these capabilities with Kubernetes 1.2 at the recent KubeCon. You can watch their presentations here: Romana (slides), Calico (slides).

How does it work?

Each solution has its own specific implementation. Although today they all rely on an on-host enforcement mechanism, future implementations could achieve the same goal by applying policy on a hypervisor, or even directly in the network itself.

External policy control software (the details vary by implementation) watches the API endpoints for newly created pods and newly applied policies. When an event occurs that requires policy configuration, the listener recognizes the change and, in response, the controller configures the interface and applies the policy. The diagram below shows an API watcher and policy controller applying network policy locally through a host agent. The network interfaces of these pods are configured by a CNI plugin on the host (not shown in the diagram).

[Image: controller.jpg — an API watcher and policy controller applying network policy through a host agent]

If network isolation or security concerns have made you hesitant to build applications on Kubernetes, these new network policies go a long way toward giving you the control you need. There's no need to wait for Kubernetes 1.3; you can use this experimental API today through the ThirdPartyResource mechanism.

If you're interested in Kubernetes and networking, there are several ways to participate and get involved:

The Networking Special Interest Group meets every two weeks at 3pm Pacific Time via the SIG-Networking hangout.

–Chris Marino, Co-Founder, Pani Networks
