Kubernetes is an open-source system that automates the deployment, scaling, and management of containerized applications. A container is a runtime environment that contains a software package and its dependencies. Kubernetes is often hosted in a cloud environment. The Cybersecurity Technical Report (CTR) provides recommended configuration and hardening guidance for setting up and securing a Kubernetes cluster.
CISA encourages users and administrators to review the updated Kubernetes Hardening Guide—which includes additional detail and explanations—and apply the hardening measures and mitigations to manage associated risks.
Healthcare continues to undergo transformation—it’s facing unprecedented challenges, new and complex expectations, and remarkable opportunities for innovation and growth. At the heart of this transformation are frontline healthcare workers—the doctors, nurses, and care team members who work to help keep us safe and healthy.
For many customers, SAP virtual hostnames are a common and expected part of the SAP landscape. They allow you to logically decouple the actual virtual machine hostname from an easy-to-understand, unique SAP system hostname. The VM hostname – often a cryptic string that complies with a company’s overall naming policy (for example x17qus80) – is bound to the VM’s only network interface card. An SAP system running on this host, say with SID T01, operates with a unique hostname for every instance: one virtual hostname for the ASCS instance, one for the PAS, another for the database, and one for each additional application server. For example: sapt01ascs, sapt01pas, sapt01db, and so on.
These virtual hostnames correspond to DNS A/PTR entries, and a unique IP address is used for each virtual hostname, as seen in the example above. These secondary IP addresses are bound to the same network interface card, so within the operating system the same NIC has – in the example above – three individual IP addresses. The virtual hostname concept hides the physical hostname from the application and makes it very easy to move SAP instances to new VMs – for example during major OS upgrades, where typically a new VM with a higher OS version and a new physical/VM hostname is deployed.
Figure 1 – Example of secondary IPs for SAP
NOTE: The configuration described in this article does NOT apply to any high-availability solution that uses an internal load balancer, whether based on Pacemaker or other 3rd-party cluster solutions. There, the load balancer fulfills the same role: the virtual hostname IP is provided by the load balancer. The configuration described in this blog post can be applied to SAP application servers including the (A)SCS of non-HA SAP systems, as well as to 2-tier SAP systems (application, DB, and SCS instances installed on a single VM).
Now that you have gone through a quick refresher on how virtual hostnames are used with SAP, how does this work in Azure? On-premises you add a secondary IP within the OS, and often the networking stack reconfigures automatically. With software-defined networking, as used in the public cloud, some additional steps are needed.
An Azure VM running an SAP workload typically requires only one NIC; there is no performance benefit to using multiple NICs. Hence, before adding the secondary IP within the OS, the IP needs to be added to the Azure network interface first. Each Azure NIC can support up to 256 secondary IP addresses.
To add the secondary IP(s) to a network interface, you modify the network interface and add an IP configuration for each secondary entry. As always, you can use the Azure portal, the Azure CLI, tools like Terraform, or the Azure API itself. Secondary IPs must be in the same subnet as the network interface itself.
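As a minimal sketch using the Azure CLI, adding one secondary IP configuration could look like the following; the resource group, NIC, and IP config names are placeholder examples, not values from this post’s environment:

# Hypothetical names; adjust to your environment.
az network nic ip-config create \
  --resource-group rg-sap-t01 \
  --nic-name x17qus80-nic \
  --name ipconfig-sapt01ascs \
  --private-ip-address 10.185.106.60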
There is NO downtime or interruption required to add or remove secondary IP addresses on Azure network interfaces; this can be done online.
Figure 2 – Example of secondary IPs configured in Azure for NIC
Once configured, as shown in the figure, the Azure networking stack knows to route network packets for IPs 10.185.106.5, .60, and .61 to this network interface.
That is it from the Azure perspective; make sure your DNS is updated with A/PTR records for each hostname/IP pair. The next step is the operating system of the VM.
OS specific steps for SAP virtual hostnames
Network packets addressed to the secondary IP(s) are now sent to this NIC and VM. But the OS must know how to deal with the network packets for the SAP virtual hostname/IP, too.
Most Linux distributions operate the NIC with DHCP by default. That means Azure provides the IP (and possibly DNS servers, if configured on the Azure virtual network) to the OS. SAP-relevant OS images should pick up the secondary IP address(es) that you configured in Azure immediately through DHCP. Don’t rely on DHCP entirely, however, since some configurations, like SAP LaMa or HA setups with Pacemaker, require you to disable SUSE Network Manager, and thus you need to add the secondary IPs manually.
Should you keep the default and use DHCP, then once your DNS is configured correctly and the hostname entries are added to DNS, there should be no action needed within the OS. You can try to reach the virtual hostname from other hosts: the database host, client PCs, etc.
virthost01:~ # ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.185.106.5/25 brd 10.185.106.127 scope global eth0
valid_lft forever preferred_lft forever
inet 10.185.106.60/25 brd 10.185.106.127 scope global secondary eth0
valid_lft forever preferred_lft forever
inet 10.185.106.61/25 brd 10.185.106.127 scope global secondary eth0
valid_lft forever preferred_lft forever
inet6 fe80::222:48ff:fe9b:ff71/64 scope link
valid_lft forever preferred_lft forever
Without DHCP, you need to add the secondary IPs manually within the OS. To add them online, without stopping applications or the OS, you can use the ip command:
[root@virthost01 ~]# ip a add 10.185.106.60/25 dev eth0
[root@virthost01 ~]# ip a add 10.185.106.61/25 dev eth0
And verify (same result as if the IP were picked up via DHCP):
[root@virthost01 ~]# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.185.106.4/25 brd 10.185.106.127 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet 10.185.106.60/25 scope global secondary eth0
valid_lft forever preferred_lft forever
inet 10.185.106.61/25 scope global secondary eth0
valid_lft forever preferred_lft forever
inet6 fe80::20d:3aff:feb1:d482/64 scope link
valid_lft forever preferred_lft forever
The ip command is not persistent, however; after each reboot or VM start the secondary IPs would need to be re-added, if not automatically handled by the DHCP client inside the OS. For persistent changes, modify the /etc/sysconfig/network-scripts/ifcfg-<adapter> (RHEL) or /etc/sysconfig/network/ifcfg-<adapter> (SLES) configuration files and restart networking or the OS.
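As a sketch, the persistent entries could look like the following, using the numbered-entry syntax of the respective distributions; verify against your distribution’s documentation before applying:

# RHEL: /etc/sysconfig/network-scripts/ifcfg-eth0 (additional lines)
IPADDR1=10.185.106.60
PREFIX1=25
IPADDR2=10.185.106.61
PREFIX2=25

# SLES: /etc/sysconfig/network/ifcfg-eth0 (additional lines)
IPADDR_1='10.185.106.60/25'
IPADDR_2='10.185.106.61/25'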
Verify the name resolution
As clearly stated in SAP note 962955 – Use of virtual or logical TCP/IP host names, both forward and reverse name lookups need to return the same hostname/IP pair. Failure often leads to SAP services not starting at all.
In the example just below, we will be working with this setup:
Hostname: x17qus80 (10.185.106.5)
SAP hostnames: sapt01ascs (10.185.106.60) and sapt01pas (10.185.106.61)
Hostname of local computer: x17qus80 (NiMyHostName)
FQHN of local computer: x17qus80.virthost.com (NiGetOwnFQDN)
Lookup of hostname: sapt01ascs (NiHostToAddr)
–> IP-Addr.: 10.185.106.60
Lookup of IP-Addr.: 10.185.106.60 (NiAddrToHost)
–> Hostname: sapt01ascs
As seen, both forward and reverse lookups return the correct values; everything is in order. Should the reverse lookup not work correctly, or return a hostname mismatching the forward name resolution, the SAP message server and other services would not start.
An additional validation and troubleshooting step is to use nslookup directly for both forward and reverse lookups (hostname and IP).
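For example, with the hostnames and IPs used in this post:

# Forward lookup: virtual hostname -> IP
nslookup sapt01ascs
# Reverse lookup: IP -> virtual hostname
nslookup 10.185.106.60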
Alternative – Using multiple NICs
The first alternative architecture to using secondary IPs on a single NIC is to simply use more than one NIC. Very often we carry over our previous thinking and complicate system design; using multiple NICs in Azure when it’s not required is one such typical example. Each NIC attached to an Azure VM must be in the same virtual network but can be in different subnets. The network bandwidth limit of a virtual machine in Azure is set regardless of how many network interfaces are attached; in other words, you do NOT get additional network throughput with additional NICs.
Figure 3 – Example of an unsuitable solution for SAP virtual hostnames
The typical usage scenario for multiple NICs is a requirement for network segmentation through subnets, and thus dual- or multi-homing VMs into two or more subnets, such as an administrative and an SAP-application subnet. Remember that subnets within a vnet can talk to each other directly if routes and network security groups (NSGs) do not restrict it.
Using multiple NICs for SAP virtual hostnames brings the negatives of additional management overhead: each NIC requires NSGs and user-defined routes to be set, plus manual routes and management within the OS, since the DHCP client in the operating system typically handles only the first NIC, so you will need to set the IP and routes for the second interface yourself. It also increases complexity and makes troubleshooting more difficult.
A single, well-managed and well-configured (NSG, routing) network interface on a VM with secondary IP configurations/addresses, as described above, is the far better solution for SAP virtual hostnames, since the SAP virtual hostname/IP(s) are in the same subnet as the actual VM. If, however, the additional overhead of multiple NICs is not seen as a burden, then this is a valid alternative.
Alternative – Using Azure Internal Load Balancer for secondary IP addresses
An Azure Internal Load Balancer (ILB) is the correct service to provide virtual/floating IPs in a highly available (clustered) architecture. The IP of such virtual hostnames, used by SAP, is bound to the load balancer, which is attached to the NICs of the VMs. This is the recommended solution for such HA systems.
Figure 4 – Example of single VM with ILB providing virtual hostname
For single VMs, a load balancer can be created, attached, and serve the same purpose as in an HA setup or with secondary IPs on the primary NIC. It is, however, not possible to set up a load balancer on secondary IP configurations for floating IP usage, so you cannot combine the two methods. As per the note at the beginning of this blog post, a load balancer should be used for any HA setup, not secondary IPs on a NIC.
The drawbacks of this architecture are higher costs and the overhead of building, managing, and troubleshooting, compared to an architecture without an ILB that uses a NIC with secondary IP configs attached. Additionally, an Azure Standard ILB attached to a VM modifies how the VM can access the Internet – for example, OS update repositories. Due to these drawbacks, ILBs are typically used only for clustered systems.
Bad alternative – Using DNS alias for SAP virtual hostname
The last alternative architecture to secondary IPs on a NIC is the use of DNS aliases. Very often DNS aliases, also known as canonical names or CNAME entries, are used to resolve the SAP virtual hostname, with the alias pointing to the actual VM IP. The benefit of this is that no additional IP address space is needed, as it’s merely DNS ‘magic’.
As an example, SAP system T01 is running on a VM with a CNAME for the SAP virtual hostname sapt01pas.
Figure 5 – Example of DNS CNAME for SAP virtual hostname
Resolving that alias hostname reveals that it points to the VM hostname, and thus 10.185.106.5 is returned as the IP.
Hostname of local computer: x17qus80 (NiMyHostName)
FQHN of local computer: x17qus80 (NiGetOwnFQDN)
Lookup of hostname: sapt01pas (NiHostToAddr)
–> IP-Addr.: 10.185.106.5
Lookup of IP-Addr.: 10.185.106.5 (NiAddrToHost)
–> Hostname: x17qus80
The CNAME setup would fail for SAP with just DNS, as the reverse lookup for the IP address will return the VM hostname, not the SAP virtual hostname.
We can overcome this issue either by modifying PTR records – and thus likely breaking something else running on the VM, like security scan tools – or through VM-local hosts file entries, which are likely to cause problems at some point in the (near) future.
A DNS alias solution is not recommended for SAP environments. As per the aforementioned SAP note 962955 – Use of virtual or logical TCP/IP host names: “Do not use any TCP network alias names to virtualize physical host names within the intra-server communication.”
Off-topic 1: Azure VM name does not have to equal OS hostname
When you deploy a VM in Azure, an OS image – from the Azure Marketplace or a custom image – is used to provide the OS within. The OS, through its DHCP client, sets the hostname provided by Azure along with the IP. VM names (think: the Azure resource) and OS hostnames do not have to match!
The VM name itself – the Azure resource – can be much longer than the OS hostname: 15 characters for Windows, up to 64 for Linux. Thus, you can leverage Azure naming for a more descriptive resource name and, inside the VM, modify /etc/hostname in the OS, benefiting from separating the two.
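For example, on a systemd-based Linux VM the OS hostname can be set independently of the Azure resource name; the hostname value here is the example from this post:

# Set the OS hostname inside the VM; the Azure VM resource keeps its own name.
hostnamectl set-hostname x17qus80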
Off-topic 2: VM outbound IP for 3rd party applications
When we deal with SAP virtual hostnames, we think of incoming network traffic and name resolution from the perspective of the client/user or 3rd-party application. The SAP server process thinks its hostname is something like sapt01pas and listens for communication coming in on that IP.
However, we often don’t think of egress traffic and how our SAP system looks to the other application. By default, SAP answers back with the primary IP of the network interface. For applications that do a reverse lookup (IP -> hostname), this might be a security issue: from their perspective, they try to talk with John (the SAP virtual hostname), but the response (IP traffic back) comes from an address where Peter (the VM hostname) is registered.
SAP has parameters for different services, like the message server, ICM, etc., to set the IP to bind to and use for responses. Some years ago, SAP introduced a ‘master’ parameter for the SAP NetWeaver kernel to manage this better and more easily. The parameter is/local_addr = <SAP virtual hostname> will ensure all outbound communication uses a specific IP – for details see SAP note 2157220 – Kernel parameter is/local_addr.
Figure 6 – IP addresses used for inbound/outbound communication paths
With this parameter set, the SAP virtual hostname will be used for outbound communication as well.
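As a minimal sketch, the parameter goes into the instance profile of the respective SAP instance; the profile path below is a hypothetical example following the usual naming convention:

# Instance profile, e.g. /sapmnt/T01/profile/T01_D00_sapt01pas
is/local_addr = sapt01pas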
Conclusion
SAP’s virtual hostname concept is very easy to use in Azure, and there are different ways to utilize it. Changing and moving SAP instances between VMs, as also done by SAP LaMa, can be accomplished very quickly.
To recap some main topics of this blog post:
You can have many secondary IPs bound to an Azure VM and its NIC
Azure handles the networking, and thus you need to make the change on the Azure NIC first, before the OS
Adding/removing secondary IPs on Azure and OS side can be done online, without a VM stop/deallocate or reboot
Do not use multiple NICs for virtual hostnames, as it adds layers of complexity. Secondary IPs are recommended instead.
Alternatively use ILBs for virtualizing hostnames/IP
DNS aliases are not recommended by SAP for virtual hostnames
My main advice when running performance benchmarks for Postgres is: “Automate it!”
If you’re measuring database performance, you are likely going to have to run the same benchmark over and over again. Either because you want a slightly different configuration, or because you realized you used some wrong settings, or maybe some other reason. By automating the way you’re running performance benchmarks, you won’t be too annoyed when this happens, because re-running the benchmarks will cost very little effort (it will only cost some time).
However, building this automation for the database benchmarks can be very time-consuming, too. So, in this post I’ll share the tools I built to make it easy to run benchmarks against Postgres—specifically against the Citus extension to Postgres running in a managed database service on Azure called Hyperscale (Citus) in Azure Database for PostgreSQL.
Here’s your map for reading this post: each anchor link takes you to a different section. The first sections explore the different types of application workloads and their characteristics, plus the off-the-shelf benchmarks that are commonly used for each. After that you can dive into the “how to” aspects of using HammerDB with Citus and Postgres on Azure. And yes, you’ll see some sample benchmarking results, too.
Why dive into the background on different workloads and database benchmarks first? Because there’s something that’s even more important than automating the way you run performance benchmarks: Choosing the right benchmark for you!
Different types of benchmarks for different types of workloads
Everyone that is using a database is using it for a different workload, because everyone has a different dataset and is running different queries. So, when comparing database performance, you will get the most accurate results by running a benchmark that’s based on your own workload. However, preparing a completely custom benchmark can be quite a bit of work.
So instead, you will likely want to run an off-the-shelf benchmark with a workload that is very similar to your own.
Benchmark specifications vs. full benchmark suites
There are two different ways in which an off-the-shelf benchmark can be provided to you:
Benchmark specification. In this case, a document describes how to run the benchmark. It will tell you how to prepare the tables, how to load the data, and what queries to run. But you’re expected to do all this manually.
Full benchmark suite. In this case an application is provided to you which will run the benchmark. You configure the benchmarking application to run against your database server—and once it’s done running it spits out a few numbers to indicate how good the run was.
It’s obvious that a full benchmark suite is usually what you want since you can simply start the benchmarking application and get results. If you only have a benchmark specification, then you will first need to write tooling to run this specification against a database.
OLTP (Online Transaction Processing) workloads
A common workload category for databases is called OLTP (Online Transaction Processing). Workloads that fall in the OLTP category send lots of small, short-running queries (or transactions) to the database.
Some characteristics of OLTP workloads are:
Inserts, updates, and deletes only affect a single row. An example: Adding an item to a user’s shopping cart.
Read operations only read a handful of items from the database. An example: listing the items in a shopping cart for a user.
Aggregations are used rarely, and when they are used, they are only applied to small sets of data. An example: getting the total price of all items in a user’s shopping cart.
The types of applications that create such a workload often have many concurrent users, who in aggregate issue many requests per second. So for OLTP workloads, it’s important that the database can handle quite a lot of these queries at the same time. Response time of the application is usually also important, so the database queries should not take very long to run. Queries should always complete in less than ~5 seconds, and most queries should complete in less than 100ms, probably even faster.
Well known database benchmarks that fall in the OLTP category are YCSB (full suite), TPC-C (specification), and HammerDB TPROC-C (full suite). There are two types of numbers that come out of these OLTP benchmarks that people are usually interested in:
Throughput in TPS (transactions per second)
Query latency, usually at different percentiles (p95, etc.)
OLAP (Online Analytical Processing) workloads
Another common database workload is called OLAP (Online Analytical Processing). This is the type of workload that is often run on data warehouses.
Some characteristics of OLAP workloads are:
Periodic batch inserts of data. New data is often added to the database from other systems in batches. This is usually done at specific times of the day when the database is not used by users, such as midnight in the local timezone.
Read operations often read large parts of the database. Common reasons for this are to answer questions from business analysts or to produce results that can be shown in quarterly shareholder meetings. A few examples of questions that require such reads:
What are the top 10 most sold products of the last year?
How many new customers have joined in the last month?
How much revenue was generated by returning customers?
Aggregations are used in almost every query. Given that read operations read large parts of the database, aggregations are necessary to make this data digestible by humans.
Queries are large and complex. To answer queries data often needs to be gathered from multiple different tables, or data needs to be compared to different data within the same table. The queries to gather and combine this data often use many of the features of SQL in a single query, such as JOINs, CTEs, subqueries, and window functions. Because they combine so many features, OLAP queries often become quite large and complex.
As opposed to OLTP, there are usually not a lot of concurrent users in an OLAP system. Usually only a single query—or only a few queries—are running at a time. Response times of these queries are also a lot higher than for OLTP workloads. OLAP queries usually take multiple seconds, or even minutes to complete. But of course database response times are still important in OLAP workloads, and waiting more than 20 minutes for a query result is often unacceptable.
Well known benchmarks that fall in the OLAP category are TPC-H (specification), TPC-DS (specification) and HammerDB TPROC-H (full suite). These benchmarks have a set of queries that use various SQL features and have different levels of complexity and number of JOINs.
There are two different results that an OLAP benchmark can give you:
How long it took to run all of the queries that are part of the benchmark
How long it took to run each of the queries, measured separately per query
HTAP (Hybrid Transactional/Analytical Processing) workloads
Another database workload category is called HTAP (Hybrid transactional/analytical processing). This category contains workloads that combine aspects of both OLTP and OLAP workloads. So, there will be lots of active users doing small transactions, while at the same time a few heavy, long-running queries are executed.
There’s only one well-known database benchmark that falls in the HTAP category: CH-benCHmark (specification). The CH-benCHmark specification describes running an OLTP benchmark (TPC-C) while also executing some queries from an OLAP benchmark on the same database. Creating a custom HTAP benchmark yourself is also possible, by running both an OLTP benchmark and an OLAP benchmark of your choice at the same time.
Challenge with benchmarking HTAP workloads
It can be quite hard to compare the numbers that come out of an HTAP benchmark across runs. This stems from the fact that you get two numbers per run of the benchmark, and these numbers often show an inverse correlation:
Throughput in TPS (transactions per second) for the OLTP part
Time it takes to run the analytical queries in seconds for the OLAP part
The problem is that as the number of transactions per second rises, the analytical queries will take longer to run. In other words, when TPS increases (good), the OLAP queries take longer (bad). There are two reasons for this:
More TPS often means that the resources of the machine (CPU/disk) are busier handling the OLTP queries. This has the side effect that these resources are less often available for the OLAP queries to use.
A certain percentage of OLTP transactions insert data into the database. So higher TPS means that the amount of data in the database grows faster, which in turn means that the OLAP queries have to read more data, thus becoming slower.
The inverse correlation between these numbers makes it hard to conclusively say whether one HTAP benchmark run has better results than another. You can conclude that one run is better only if both numbers are better. If one of the numbers is better while the other is worse, then it becomes a matter of tradeoffs: it’s up to you to decide what you consider the most important factor for your workload, the number of OLTP transactions per second or the time it takes to run the OLAP queries.
Figure 1: A table comparing different database workload types.
Dangers of comparing benchmark results you find online
Instead of running benchmarks yourself, it can be tempting to compare numbers published online by others. One thing to be careful of when comparing benchmarks run by others: There are many different ways to configure the benchmarks. So, comparing them is usually apples to oranges. A few of the differences that matter a lot are:
Is it running on production infrastructure? A lot more performance can usually be achieved when critical production features have been disabled. Things like backups, High Availability (HA) or security features (like TLS) can all impact performance.
How big is the dataset that was used? Does it fit in RAM or not? Reading from disk is much slower than reading from RAM. So, it matters a lot for the results of a benchmark if all the data fits in RAM.
Is the hardware excessively expensive? Obviously a database that costs $500 per month is expected to perform worse than one that costs $50,000 per month.
What benchmark implementation was used? Many vendors publish results of a TPC benchmark specification, where the benchmark was run using a custom implementation of the spec. These implementations have often not been validated and thus might not implement the specification correctly.
So, while it is easiest to compare database benchmark numbers you find online, you probably want to run your own benchmarks with your own data, too.
HammerDB TPROC-C for OLTP workloads
HammerDB is an easy-to-use open-source benchmarking suite for databases. HammerDB can be used to run an OLTP or an OLAP benchmark. The OLTP one is called TPROC-C[1] and is based on the TPC-C specification. The OLAP benchmark is called TPROC-H and is based on the TPC-H specification. HammerDB has implementations of these benchmarks for a lot of different databases, which makes it easy to compare results across database types.
I have submitted several pull requests to HammerDB to improve the benchmark suite. One of these pull requests makes HammerDB TPROC-C work with the Citus extension to Postgres (therefore with distributed PostgreSQL). Two others greatly improved the speed at which the benchmark data is loaded into Postgres. All my pull requests have been accepted and were released in HammerDB 4.4. So, starting with HammerDB 4.4 you can run the HammerDB TPROC-C benchmark against Citus.
The main number that HammerDB gives you to compare across benchmark runs is called NOPM (new orders per minute). HammerDB uses NOPM instead of TPS (transactions per second) to make the number comparable between the different databases that HammerDB supports. The way NOPM is measured is based on the tpmC metric from the official TPC-C specification, although in HammerDB it is called NOPM instead of tpmC, because tpmC is technically reserved for official, fully audited benchmark results.
How to benchmark Citus & Postgres on Azure with HammerDB, ARM, Bicep, tmux and cloud-init
Like I mentioned at the start, the most important thing when running benchmarks is to automate running them. In my experience you’re going to be re-running (almost) the same benchmark a lot!
That’s why I wanted to make running performance benchmarks with HammerDB against Postgres and Citus even easier than HammerDB already does on its own.
So, I created open source benchmark tooling (repo on GitHub) around HammerDB to make running benchmarks even easier—especially for the Citus extension to Postgres, running on Azure. When you use Postgres extensions, there are two layers of database software involved: you are running on both the Postgres database and also on the Postgres extension. So, the open source benchmarking automation I created for Citus runs benchmarks on the Hyperscale (Citus) option in the Azure Database for PostgreSQL managed service.
The benchmark tooling I created uses various things to make running benchmarks as easy as possible:
ARM templates in the Bicep format are used to provision all of the Azure resources needed for the benchmark. It provisions the main thing you need: a Citus database cluster, specifically a Hyperscale (Citus) server group in Azure Database for PostgreSQL. But it also provisions a separate VM that’s used to run the benchmark program on—this VM is also called the “driver VM”.
Tmux is used to run the benchmark in the background. There is nothing worse than having to restart a 6-hour benchmark after 5 hours, only because your internet connection broke. Tmux resolves this by keeping the benchmark application running in the background even when you disconnect (see the reattach commands right after this list).
A cloud-init script is used to start the benchmark. The ARM template for the driver VM contains a cloud-init script that automatically starts the benchmark, once Postgres becomes reachable. That way you can just sit back and relax after you start the provisioning process. The benchmark will automatically start running in the background once the database and driver VM have been provisioned.
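If you want to check on a run later, you can reattach to the background session on the driver VM; the session name below is an assumption, check the tooling’s cloud-init script for the actual name:

# List running tmux sessions on the driver VM, then reattach to the benchmark.
tmux ls
tmux attach -t benchmark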
At the time of writing, the open source benchmark tooling I created supports running HammerDB TPROC-C (OLTP) and a custom implementation of the CH-benCHmark specification (HTAP). However, even if you want to run a different benchmark, the tooling I created will likely still be very useful to you. The only thing that you’d have to change to run another benchmark should be the section of the cloud-init script that installs and starts the benchmark. Feel free to send a PR to the repository to add support for another benchmark.
Tips about the Citus database configuration
Apart from automating your benchmarks, there are a couple of Citus and Postgres related things that you should keep in mind when running benchmarks (a short SQL sketch follows this list):
Don’t forget to distribute the Postgres tables! Most benchmarking tools don’t have built-in support for distributing Postgres tables with the Citus extension, so you will want to add some steps where you distribute the tables. If possible, it’s best to do this before loading the data; that way the data loading will be faster.
Choose the right distribution column. When distributing tables with Citus, it’s important to choose the right distribution column, otherwise performance can suffer. What the right distribution column is depends on the queries in the benchmark. Luckily, we have documentation with advice on choosing the right distribution column for you.
After building your dataset, run VACUUM ANALYZE on all your tables. Otherwise, Postgres statistics can be completely wrong, and you might get very slow query plans.
Be sure that your shard_count is a multiple of the number of workers that you have. Otherwise, the shards cannot be divided evenly across your workers, and some workers would get more load than others. A good default shard_count is 48, since 48 is divisible by many numbers.
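Putting these tips together, a minimal SQL sketch could look like the following; the table and column names are hypothetical examples, not the actual benchmark schema:

-- Session-level setting; pick a multiple of your worker count (48 assumed here).
SET citus.shard_count = 48;
-- Distribute the table on the chosen distribution column, before loading data.
SELECT create_distributed_table('orders', 'warehouse_id');
-- After loading the data, refresh statistics so the planner makes good choices.
VACUUM ANALYZE;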
How to use the citus-benchmark tooling to run HammerDB benchmarks
Like I said, I tried to make running benchmarks as easy as possible. So, all you need to do is run this simple command (for detailed instructions check out the README in the “azure” directory):
# IMPORTANT NOTE: Running this command will provision 4 new Citus clusters
# and 4 times a 64-vCore driver VM in your Azure subscription. So, running
# the following command will cost you (or your employer) money!
azure/bulk-run.sh azure/how-to-benchmark-blog.runs | tee -a results.csv
The command above will start running HammerDB TPROC-C on a few different cluster sizes on production infrastructure for Hyperscale (Citus), a deployment option in the Azure Database for PostgreSQL managed service.[2] The results of these benchmark runs are all gathered in the results.csv file.
When you look at the newly created results.csv file, you’ll see strings that look like, for example, “c4+2w8”:
c4+2w8: This is simply a short way of saying that the cluster of that run has a 4-vCore coordinator (“c4”) and 2 workers (“2w”), each with 8 vCores.
The total number of cores present in the cluster is also shown in parentheses.
Now that you have a csv file, you can use Excel (or whatever you prefer) to create a graph that looks roughly like this:
Figure 2: A graph comparing performance on differently-sized Hyperscale (Citus) database clusters in Azure Database for PostgreSQL. Performance is measured using the HammerDB TPROC-C benchmark in NOPM (new orders per minute) on the y-axis. And while these database servers are fairly small (only 8 cores per node), you can see the performance increases (higher NOPM is better) as more worker nodes are added to the Hyperscale (Citus) database clusters on Azure.
As you can see, NOPM keeps increasing when you add more workers to the Citus cluster. This shows that Citus delivers on the promise of scaling out: By simply adding more Citus nodes to the cluster in Azure Database for PostgreSQL, our performance goes up.
Getting to 2.0 million NOPM with larger Citus database clusters on Azure
The numbers in the graph above were gathered using relatively small Citus clusters. The main purpose of the chart is to show you how easy it is to get these numbers using HammerDB and the open source benchmarking tooling I created.
It’s possible to observe much higher benchmark results for Citus on Azure if you increase the number of vCores on each database node, and/or if you increase the total number of worker nodes in the Citus cluster. Higher performance with more vCores can be seen in our paper that was accepted at SIGMOD ’21. We used a coordinator and 8 workers with 16 cores and the NOPM in that paper was a lot higher.
Recently we also ran HammerDB TPROC-C on a very big Citus database cluster and got a whopping 2.0 million NOPM, using our regular managed service infrastructure on Azure.[3]
Some more details about this 2M NOPM HammerDB result:
Apart from using more worker nodes and more vCores per node than in my sample run earlier, there was one other thing that needed to be changed to achieve the 2M NOPM: HammerDB needed to be configured to use a lot more concurrent connections. The earlier sample benchmark run shown in Figure 2 above used 250 connections, but to keep this big cluster constantly busy I configured HammerDB to use 5000 connections.
The number of connections provided by default for Hyperscale (Citus) server groups in Azure Database for PostgreSQL depends on the coordinator size—and the maximum number of user connections is set by the system at 1000. To increase it, you just need to reach out to Azure support and request an increase in the maximum number of user connections to at least 5000 (a bit more is better to be on the safe side) on Postgres 14 for your Hyperscale (Citus) server group. So, creating a Hyperscale (Citus) cluster that can reproduce the 2M NOPM results is just a single support ticket away. After that you can simply use my benchmark tooling to run a benchmark against this cluster.
Have fun benchmarking your database performance
Comparing performance of databases or cloud providers can seem daunting. But with the knowledge and tools provided in this blog, benchmarking the database performance of Hyperscale (Citus) in Azure Database for PostgreSQL should be much easier. When running any performance benchmarks yourself, make sure that you:
Choose a benchmark that matches your workload. Does your workload fall into the OLTP, OLAP, or HTAP category?
And regardless of whether you’re looking to run your app on Citus open source in a self-managed way—or you’re looking to run your application on a managed service on Azure—it’s quite easy to get started with Citus to scale out Postgres.
Footnotes
If you’re first hearing of the benchmark name TPROC-C, you might think the “PROC” part of the name is because it uses stored “proc”edures. Not so! Rather, the name for the HammerDB workload TPROC-C means “Transaction Processing Benchmark derived from the TPC “C” specification”. More details here: https://www.hammerdb.com/docs/ch03s02.html. ↩︎
Postgres 14 and Citus 10.2 were used in this benchmark. All servers had 512GB storage and pgbouncer was not used. HammerDB TPROC-C was configured to run with 1000 warehouses and 250 virtual users. The “all warehouses” setting was turned on. ↩︎
Postgres 14 and Citus 10.2 were used in this benchmark. All servers had 2TB storage and pgbouncer was not used. HammerDB TPROC-C was configured to run with 5000 warehouses and 5000 virtual users. The “all warehouses” setting was turned on. A few Postgres settings were also changed from the defaults provided by Hyperscale (Citus), some of these require contacting support to change:
Service Bus (SB) client app fails with QuotaExceededException when sending messages to Session Enabled SB Queue, Subscription…
The exception will look something like this:
QuotaExceededException: The maximum entity size has been reached or exceeded for Queue: <SB Queue Name>. Size of entity in bytes:<Current Entity Size>, Max entity size in bytes: <Max Entity Size>.
QuotaExceededException is thrown when the message quota has been exceeded. The exception message further clarifies that the entity size has reached or exceeded the max limit.
You check Active, Dead lettered, Scheduled, Deferred, etc. messages in the entity. The size of all the messages is zero (or so small that it cannot add up to the max size of the entity).
You start wondering: what is causing the SB entity to fill up?
Cause:
The root cause of the SB entity filling up could be that SB client applications are not cleaning up the session states. Session state remains as long as it isn’t explicitly cleared, even if all messages in a session are consumed. The previously set session state can be cleared by passing null to the SetState method on the receiver.
The session state held in a queue or in a subscription counts towards that entity’s storage quota. When the application is finished with a session, it is therefore recommended that the application clean up its retained state.
You will see this scenario only with session-enabled SB entities, i.e., Queues and Subscriptions.
Make sure that Active, Dead-lettered, Scheduled, etc., messages are not filling the entity. Note: do not try to second-guess the size of an entity by the count of messages; you may have enabled Large Message Support.
Using Service Bus Explorer Tool, you can peek into the entities to understand the message distribution. If this is your test environment, you can Receive and Delete all the messages, including Dead lettered messages, OR purge the messages, and see if the entity size significantly decreases.
The above checks indicate that unused session state is not being cleared. So please review the receiver application code to check whether the session state is cleared.
Mitigation & Resolution:
A proper solution to the session state leak filling the entity is to clean up the session state by calling SetState with a null parameter.
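A minimal sketch using the current Azure.Messaging.ServiceBus .NET SDK, where the method is named SetSessionStateAsync; the connection string and queue name are placeholders:

// Placeholders: connectionString and the queue name are examples.
await using var client = new ServiceBusClient(connectionString);
ServiceBusSessionReceiver receiver =
    await client.AcceptNextSessionAsync("t01-session-queue");

// ... receive and complete the session's messages ...

// Clear the retained session state so it no longer counts
// against the entity's storage quota.
await receiver.SetSessionStateAsync(null);
await receiver.DisposeAsync();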
There are times when customers may not be able to make these code changes immediately. In such cases they can temporarily increase the entity size, or delete the entity and recreate it with the same name. This mitigation should always be followed by the above-suggested code fix.
Microsoft.Data.SqlClient 5.0 Preview 1 has been released. This release contains improvements and updates to the Microsoft.Data.SqlClient data provider for SQL Server.
Our plan is to provide GA releases twice a year with two or three preview releases in between. This cadence should provide time for feedback and allow us to deliver features and fixes in a timely manner. This first 5.0 preview includes fixes and changes over the previous 4.0 GA release.
We appreciate the time and effort you spend checking out our previews. It makes the final product that much better. If you encounter any issues or have any feedback, head over to the SqlClient GitHub repository and submit an issue.
If you’re a college student, or a student in a vocational or certificate program, there are scams that specifically target you. This National Consumer Protection Week, we’ve been focused on how scams affect every community. We want to let you know about some scams that might affect you, but more importantly, hear from you about what you’re seeing.
Scammers often target students with scams related to jobs and making money. For example:
Fake check scams: These scams all involve someone sending you a check, asking you to deposit it, sending some of the money to someone else, and keeping the rest as payment. The scams that target students often involve jobs you could do on the side: being a mystery shopper, advertising with a car wrap, or working as a part-time assistant or dog walker for someone pretending to be your professor. Except those “jobs” are all fake, and that check they gave you? It’s going to bounce, and when it does and the bank realizes the check was fake, the bank will want that money back.
Cryptocurrency investment scams: As we wrote about last May, people in their 20s and 30s have lost a lot of money to investment scams, and many of those losses have been in cryptocurrency. These scams can involve fake investment sites, using celebrities and false promises that you can multiply your money, or using online dating sites to sweet-talk you into fake crypto investments.
Every report helps us in our mission to protect every community from scammers. When you report to us, it gets shared with over 3,000 law enforcers — and it helps the FTC spread the word so others can avoid scams.
Join the NCPW conversation on social media: #NCPW2022 and see what’s up at ftc.gov/ncpw.
CISA is aware of a privilege escalation vulnerability in Linux kernel versions 5.8 and later known as “Dirty Pipe” (CVE-2022-0847). A local attacker could exploit this vulnerability to take control of an affected system.
CISA encourages users and administrators to review CVE-2022-0847 and update to Linux kernel version 5.16.11, 5.15.25, or 5.10.102 or later.