Commit Graph

4316 Commits

Author SHA1 Message Date
Linus Torvalds
b2fe5fa686 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Significantly shrink the core networking routing structures. Result
    of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

 2) Add netdevsim driver for testing various offloads, from Jakub
    Kicinski.

 3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

 4) Add a 2nd listener hash table for TCP, similar to what was done for
    UDP. From Martin KaFai Lau.

 5) Add eBPF based queue selection to tun, from Jason Wang.

 6) Lockless qdisc support, from John Fastabend.

 7) SCTP stream interleave support, from Xin Long.

 8) Smoother TCP receive autotuning, from Eric Dumazet.

 9) Lots of erspan tunneling enhancements, from William Tu.

10) Add true function call support to BPF, from Alexei Starovoitov.

11) Add explicit support for GRO HW offloading, from Michael Chan.

12) Support extack generation in more netlink subsystems. From Alexander
    Aring, Quentin Monnet, and Jakub Kicinski.

13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
    Russell King.

14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

15) Many improvements and simplifications to the NFP driver bpf JIT,
    from Jakub Kicinski.

16) Support for ipv6 non-equal cost multipath routing, from Ido
    Schimmel.

17) Add resource abstration to devlink, from Arkadi Sharshevsky.

18) Packet scheduler classifier shared filter block support, from Jiri
    Pirko.

19) Avoid locking in act_csum, from Davide Caratti.

20) devinet_ioctl() simplifications from Al viro.

21) More TCP bpf improvements from Lawrence Brakmo.

22) Add support for onlink ipv6 route flag, similar to ipv4, from David
    Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
  tls: Add support for encryption using async offload accelerator
  ip6mr: fix stale iterator
  net/sched: kconfig: Remove blank help texts
  openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
  tcp_nv: fix potential integer overflow in tcpnv_acked
  r8169: fix RTL8168EP take too long to complete driver initialization.
  qmi_wwan: Add support for Quectel EP06
  rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
  ipmr: Fix ptrdiff_t print formatting
  ibmvnic: Wait for device response when changing MAC
  qlcnic: fix deadlock bug
  tcp: release sk_frag.page in tcp_disconnect
  ipv4: Get the address of interface correctly.
  net_sched: gen_estimator: fix lockdep splat
  net: macb: Handle HRESP error
  net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
  ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  ipv6: change route cache aging logic
  i40e/i40evf: Update DESC_NEEDED value to reflect larger value
  bnxt_en: cleanup DIM work on device shutdown
  ...
2018-01-31 14:31:10 -08:00
Linus Torvalds
3dbc4f5485 Merge branch 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull seccomp updates from James Morris:
 "Add support for retrieving seccomp metadata"

* 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ptrace, seccomp: add support for retrieving seccomp metadata
  seccomp: hoist out filter resolving logic
2018-01-31 13:44:45 -08:00
Linus Torvalds
19e7b5f994 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "All kinds of misc stuff, without any unifying topic, from various
  people.

  Neil's d_anon patch, several bugfixes, introduction of kvmalloc
  analogue of kmemdup_user(), extending bitfield.h to deal with
  fixed-endians, assorted cleanups all over the place..."

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
  alpha: osf_sys.c: use timespec64 where appropriate
  alpha: osf_sys.c: fix put_tv32 regression
  jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
  dcache: delete unused d_hash_mask
  dcache: subtract d_hash_shift from 32 in advance
  fs/buffer.c: fold init_buffer() into init_page_buffers()
  fs: fold __inode_permission() into inode_permission()
  fs: add RWF_APPEND
  sctp: use vmemdup_user() rather than badly open-coding memdup_user()
  snd_ctl_elem_init_enum_names(): switch to vmemdup_user()
  replace_user_tlv(): switch to vmemdup_user()
  new primitive: vmemdup_user()
  memdup_user(): switch to GFP_USER
  eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget()
  eventfd: fold eventfd_ctx_read() into eventfd_read()
  eventfd: convert to use anon_inode_getfd()
  nfs4file: get rid of pointless include of btrfs.h
  uvc_v4l2: clean copyin/copyout up
  vme_user: don't use __copy_..._user()
  usx2y: don't bother with memdup_user() for 16-byte structure
  ...
2018-01-31 09:25:20 -08:00
Linus Torvalds
26064ea409 Merge tag 'gfs2-4.16.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull GFS2 updates from Bob Peterson:
 "We've got 30 patches for this merge window. These generally fall into
  five categories:

   - code cleanups

   - patches related to adding PUNCH_HOLE support to GFS2

   - support for new fields in resource group headers

   - a few bug fixes

   - support for new fields in journal log headers. These new fields,
     which were previously unused, are designed to make it easier to
     track down file system corruption, and allow fsck.gfs2 to make more
     intelligent decisions when finding and fixing file system
     corruption.

  Details:

   - Two patches from Abhi Das, to trim the ordered writes list, which
     used to grow uncontrollably until unmount.

   - Several patches from Andreas Gruenbacher: remove an unused
     parameter from function gfs2_write_jdata_pagevec, remove a
     pointless BUG_ON, clean up an error patch in trunc_start, remove
     some unused parameters from truncate, make gfs2_journaled_truncate
     more efficient, clean up the support functions for truncate, fix
     metadata read-ahead for truncate to make it faster, fix up the
     non-recursive truncate code, rework and rename
     gfs2_block_truncate_page, generalize the non-recursive truncate
     code so it can take a range of values for punch_hole support,
     introduce new PUNCH_HOLE support that take advantage of the
     previous patches, add fallocate support with PUNCH_HOLE, fix some
     typos in the comments, add the function gfs2_max_stuffed_size to
     replace a piece of code that was needlessly repeated throughout
     GFS2, a minor cleanup to function gfs2_page_add_databufs, get rid
     of function gfs2_log_header_in in preparation for the new log
     header fields, and also fix up some missing newlines in kernel
     messages.

   - Andy Price added a new field to resource groups to indicate where
     the next one should be, to allow fsck.gfs2 to make better repairs.
     He also added new rindex fields for consistency checking, and added
     a crc field to resource group headers for consistency checking.

   - I reduced redundancy in functions common to freeing dinodes, and
     when writing log headers between the journalling code and journal
     recovery code. Also added new fields to journal log headers based
     on a prototype from Steve Whitehouse, and log the source of journal
     log headers so we can better track down journal corruption. Minor
     comment typo fix and a fix for a BUG in an unlink error path.

   - Steve Whitehouse contributed a patch to fix an incorrect use of the
     gfs2_blk2rgrpd function.

   - Tetsuo Handa contributed a patch that fixes incorrect error
     handling in function init_gfs2_fs"

* tag 'gfs2-4.16.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (30 commits)
  gfs2: Add a few missing newlines in messages
  gfs2: Remove inode from ordered write list in gfs2_write_inode()
  GFS2: Don't try to end a non-existent transaction in unlink
  GFS2: Fix minor comment typo
  GFS2: Log the reason for log flushes in every log header
  GFS2: Introduce new gfs2_log_header_v2
  gfs2: Get rid of gfs2_log_header_in
  gfs2: Minor gfs2_page_add_databufs cleanup
  gfs2: Add gfs2_max_stuffed_size
  gfs2: Typo fixes
  gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)
  gfs2: Turn trunc_dealloc into punch_hole
  gfs2: Generalize truncate code
  Turn gfs2_block_truncate_page into gfs2_block_zero_range
  gfs2: Improve non-recursive delete algorithm
  gfs2: Fix metadata read-ahead during truncate
  gfs2: Clean up {lookup,fillup}_metapath
  gfs2: Remove minor gfs2_journaled_truncate inefficiencies
  gfs2: truncate: Remove unnecessary oldsize parameters
  gfs2: Clean up trunc_start error path
  ...
2018-01-31 08:55:58 -08:00
Linus Torvalds
efd52b5d36 Merge tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable bugfixes:

   - Fix breakages in the nfsstat utility due to the inclusion of the
     NFSv4 LOOKUPP operation

   - Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall()
     due to nfs_idmap_legacy_upcall() being called without an 'aux'
     parameter

   - Fix a refcount leak in the standard O_DIRECT error path

   - Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path

   - Fix CPU latency issues with nfs_commit_release_pages()

   - Fix the LAYOUTUNAVAILABLE error case in the file layout type

   - NFS: Fix a race between mmap() and O_DIRECT

  Features:

   - Support the statx() mask and query flags to enable optimisations
     when the user is requesting only attributes that are already up to
     date in the inode cache, or is specifying the AT_STATX_DONT_SYNC
     flag

   - Add a module alias for the SCSI pNFS layout type

  Bugfixes:

   - Automounting when resolving a NFSv4 referral should preserve the
     RDMA transport protocol settings

   - Various other RDMA bugfixes from Chuck

   - pNFS block layout fixes

   - Always set NFS_LOCK_LOST when a lock is lost"

* tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (69 commits)
  NFS: Fix a race between mmap() and O_DIRECT
  NFS: Remove a redundant call to unmap_mapping_range()
  pnfs/blocklayout: Ensure disk address in block device map
  pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors
  lockd: Fix server refcounting
  SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint
  SUNRPC: Micro-optimize __rpc_execute
  SUNRPC: task_run_action should display tk_callback
  sunrpc: Format RPC events consistently for display
  SUNRPC: Trace xprt_timer events
  xprtrdma: Correct some documenting comments
  xprtrdma: Fix "bytes registered" accounting
  xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects
  xprtrdma: Add trace points to instrument QP and CQ access upcalls
  xprtrdma: Add trace points in the client-side backchannel code paths
  xprtrdma: Add trace points for connect events
  xprtrdma: Add trace points to instrument MR allocation and recovery
  xprtrdma: Add trace points to instrument memory invalidation
  xprtrdma: Add trace points in reply decoder path
  xprtrdma: Add trace points to instrument memory registration
  ..
2018-01-30 19:03:48 -08:00
Linus Torvalds
168fe32a07 Merge branch 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull poll annotations from Al Viro:
 "This introduces a __bitwise type for POLL### bitmap, and propagates
  the annotations through the tree. Most of that stuff is as simple as
  'make ->poll() instances return __poll_t and do the same to local
  variables used to hold the future return value'.

  Some of the obvious brainos found in process are fixed (e.g. POLLIN
  misspelled as POLL_IN). At that point the amount of sparse warnings is
  low and most of them are for genuine bugs - e.g. ->poll() instance
  deciding to return -EINVAL instead of a bitmap. I hadn't touched those
  in this series - it's large enough as it is.

  Another problem it has caught was eventpoll() ABI mess; select.c and
  eventpoll.c assumed that corresponding POLL### and EPOLL### were
  equal. That's true for some, but not all of them - EPOLL### are
  arch-independent, but POLL### are not.

  The last commit in this series separates userland POLL### values from
  the (now arch-independent) kernel-side ones, converting between them
  in the few places where they are copied to/from userland. AFAICS, this
  is the least disruptive fix preserving poll(2) ABI and making epoll()
  work on all architectures.

  As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
  it will trigger only on what would've triggered EPOLLWRBAND on other
  architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
  at all on sparc. With this patch they should work consistently on all
  architectures"

* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
  make kernel-side POLL... arch-independent
  eventpoll: no need to mask the result of epi_item_poll() again
  eventpoll: constify struct epoll_event pointers
  debugging printk in sg_poll() uses %x to print POLL... bitmap
  annotate poll(2) guts
  9p: untangle ->poll() mess
  ->si_band gets POLL... bitmap stored into a user-visible long field
  ring_buffer_poll_wait() return value used as return value of ->poll()
  the rest of drivers/*: annotate ->poll() instances
  media: annotate ->poll() instances
  fs: annotate ->poll() instances
  ipc, kernel, mm: annotate ->poll() instances
  net: annotate ->poll() instances
  apparmor: annotate ->poll() instances
  tomoyo: annotate ->poll() instances
  sound: annotate ->poll() instances
  acpi: annotate ->poll() instances
  crypto: annotate ->poll() instances
  block: annotate ->poll() instances
  x86: annotate ->poll() instances
  ...
2018-01-30 17:58:07 -08:00
Linus Torvalds
0aebc6a440 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
 "The main theme of this pull request is security covering variants 2
  and 3 for arm64. I expect to send additional patches next week
  covering an improved firmware interface (requires firmware changes)
  for variant 2 and way for KPTI to be disabled on unaffected CPUs
  (Cavium's ThunderX doesn't work properly with KPTI enabled because of
  a hardware erratum).

  Summary:

   - Security mitigations:
      - variant 2: invalidate the branch predictor with a call to
        secure firmware
      - variant 3: implement KPTI for arm64

   - 52-bit physical address support for arm64 (ARMv8.2)

   - arm64 support for RAS (firmware first only) and SDEI (software
     delegated exception interface; allows firmware to inject a RAS
     error into the OS)

   - perf support for the ARM DynamIQ Shared Unit PMU

   - CPUID and HWCAP bits updated for new floating point multiplication
     instructions in ARMv8.4

   - remove some virtual memory layout printks during boot

   - fix initial page table creation to cope with larger than 32M kernel
     images when 16K pages are enabled"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits)
  arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm
  arm64: Turn on KPTI only on CPUs that need it
  arm64: Branch predictor hardening for Cavium ThunderX2
  arm64: Run enable method for errata work arounds on late CPUs
  arm64: Move BP hardening to check_and_switch_context
  arm64: mm: ignore memory above supported physical address size
  arm64: kpti: Fix the interaction between ASID switching and software PAN
  KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
  KVM: arm64: Handle RAS SErrors from EL2 on guest exit
  KVM: arm64: Handle RAS SErrors from EL1 on guest exit
  KVM: arm64: Save ESR_EL2 on guest SError
  KVM: arm64: Save/Restore guest DISR_EL1
  KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
  KVM: arm/arm64: mask/unmask daif around VHE guests
  arm64: kernel: Prepare for a DISR user
  arm64: Unconditionally enable IESB on exception entry/return for firmware-first
  arm64: kernel: Survive corrected RAS errors notified by SError
  arm64: cpufeature: Detect CPU RAS Extentions
  arm64: sysreg: Move to use definitions for all the SCTLR bits
  arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
  ...
2018-01-30 13:57:43 -08:00
Linus Torvalds
af8c5e2d60 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Implement frequency/CPU invariance and OPP selection for
     SCHED_DEADLINE (Juri Lelli)

   - Tweak the task migration logic for better multi-tasking
     workload scalability (Mel Gorman)

   - Misc cleanups, fixes and improvements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Make bandwidth enforcement scale-invariant
  sched/cpufreq: Move arch_scale_{freq,cpu}_capacity() outside of #ifdef CONFIG_SMP
  sched/cpufreq: Remove arch_scale_freq_capacity()'s 'sd' parameter
  sched/cpufreq: Always consider all CPUs when deciding next freq
  sched/cpufreq: Split utilization signals
  sched/cpufreq: Change the worker kthread to SCHED_DEADLINE
  sched/deadline: Move CPU frequency selection triggering points
  sched/cpufreq: Use the DEADLINE utilization signal
  sched/deadline: Implement "runtime overrun signal" support
  sched/fair: Only immediately migrate tasks due to interrupts if prev and target CPUs share cache
  sched/fair: Correct obsolete comment about cpufreq_update_util()
  sched/fair: Remove impossible condition from find_idlest_group_cpu()
  sched/cpufreq: Don't pass flags to sugov_set_iowait_boost()
  sched/cpufreq: Initialize sg_cpu->flags to 0
  sched/fair: Consider RT/IRQ pressure in capacity_spare_wake()
  sched/fair: Use 'unsigned long' for utilization, consistently
  sched/core: Rework and clarify prepare_lock_switch()
  sched/fair: Remove unused 'curr' parameter from wakeup_gran
  sched/headers: Constify object_is_on_stack()
2018-01-30 11:55:56 -08:00
Linus Torvalds
d8b91dde38 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Clean up the x86 instruction decoder (Masami Hiramatsu)

   - Add new uprobes optimization for PUSH instructions on x86 (Yonghong
     Song)

   - Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)

   - Fix misc bugs, update documentation, plus various cleanups (Jiri
     Olsa)

  There's a large number of tooling side improvements:

   - Intel-PT/BTS improvements (Adrian Hunter)

   - Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)

   - Introduce an errno code to string facility (Hendrik Brueckner)

   - Various build system improvements (Jiri Olsa)

   - Add support for CoreSight trace decoding by making the perf tools
     use the external openCSD (Mathieu Poirier, Tor Jeremiassen)

   - Add ARM Statistical Profiling Extensions (SPE) support (Kim
     Phillips)

   - libtraceevent updates (Steven Rostedt)

   - Intel vendor event JSON updates (Andi Kleen)

   - Introduce 'perf report --mmaps' and 'perf report --tasks' to show
     info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)

   - Add infrastructure to record first and last sample time to the
     perf.data file header, so that when processing all samples in a
     'perf record' session, such as when doing build-id processing, or
     when specifically requesting that that info be recorded, use that
     in 'perf report --time', that also got support for percent slices
     in addition to absolute ones.

     I.e. now it is possible to ask for the samples in the 10%-20% time
     slice of a perf.data file (Jin Yao)

   - Allow system wide 'perf stat --per-thread', sorting the result (Jin
     Yao)

     E.g.:

      [root@jouet ~]# perf stat --per-thread --metrics IPC
      ^C
       Performance counter stats for 'system wide':

                  make-22229  23,012,094,032  inst_retired.any   #  0.8 IPC
                   cc1-22419     692,027,497  inst_retired.any   #  0.8 IPC
                   gcc-22418     328,231,855  inst_retired.any   #  0.9 IPC
                   cc1-22509     220,853,647  inst_retired.any   #  0.8 IPC
                   gcc-22486     199,874,810  inst_retired.any   #  1.0 IPC
                    as-22466     177,896,365  inst_retired.any   #  0.9 IPC
                   cc1-22465     150,732,374  inst_retired.any   #  0.8 IPC
                   gcc-22508     112,555,593  inst_retired.any   #  0.9 IPC
                   cc1-22487     108,964,079  inst_retired.any   #  0.7 IPC
       qemu-system-x86-2697       21,330,550  inst_retired.any   #  0.3 IPC
       systemd-journal-551        20,642,951  inst_retired.any   #  0.4 IPC
       docker-containe-17651       9,552,892  inst_retired.any   #  0.5 IPC
       dockerd-current-9809        7,528,586  inst_retired.any   #  0.5 IPC
                  make-22153  12,504,194,380  inst_retired.any   #  0.8 IPC
               python2-22429  12,081,290,954  inst_retired.any   #  0.8 IPC
      <SNIP>
               python2-22429  15,026,328,103  cpu_clk_unhalted.thread
                   cc1-22419     826,660,193  cpu_clk_unhalted.thread
                   gcc-22418     365,321,295  cpu_clk_unhalted.thread
                   cc1-22509     279,169,362  cpu_clk_unhalted.thread
                   gcc-22486     210,156,950  cpu_clk_unhalted.thread
      <SNIP>

           5.638075538 seconds time elapsed

     [root@jouet ~]#

   - Improve shell auto-completion of perf events (Jin Yao)

   - 'perf probe' improvements (Masami Hiramatsu)

   - Improve PMU infrastructure to support amp64's ThunderX2
     implementation defined core events (Ganapatrao Kulkarni)

   - Various annotation related improvements and fixes (Thomas Richter)

   - Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
     code, removing the 'overwrite' parameter from several functions as
     it was always used it as 'false' (Wang Nan)

   - Fix/improve 'perf record' reverse recording support (Wang Nan)

   - Improve command line options documentation (Sihyeon Jang)

   - Optimize sample parsing for ordering events, where we don't need to
     parse all the PERF_SAMPLE_ bits, just the ones leading to the
     timestamp needed to reorder events (Jiri Olsa)

   - Generalize the annotation code to support other source information
     besides objdump/DWARF obtained ones, starting with python scripts,
     that will is slated to be merged soon (Jiri Olsa)

   - ... and a lot more that I failed to list, see the shortlog and
     changelog for details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
  perf trace beauty flock: Move to separate object file
  perf evlist: Remove fcntl.h from evlist.h
  perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
  perf trace: Do not print from time delta for interrupted syscall lines
  perf trace: Add --print-sample
  perf bpf: Remove misplaced __maybe_unused attribute
  MAINTAINERS: Adding entry for CoreSight trace decoding
  perf tools: Add mechanic to synthesise CoreSight trace packets
  perf tools: Add full support for CoreSight trace decoding
  pert tools: Add queue management functionality
  perf tools: Add functionality to communicate with the openCSD decoder
  perf tools: Add support for decoding CoreSight trace data
  perf tools: Add decoder mechanic to support dumping trace data
  perf tools: Add processing of coresight metadata
  perf tools: Add initial entry point for decoder CoreSight traces
  perf tools: Integrating the CoreSight decoding library
  perf vendor events intel: Update IvyTown files to V20
  perf vendor events intel: Update IvyBridge files to V20
  perf vendor events intel: Update BroadwellDE events to V7
  perf vendor events intel: Update SkylakeX events to V1.06
  ...
2018-01-30 11:15:14 -08:00
Linus Torvalds
aca21de2e8 Merge tag 'm68k-for-v4.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:

  - first part of an overhaul of the NuBus subsystem, to bring it up to
    modern driver model standards

  - a race condition fix for Mac

  - defconfig updates

* tag 'm68k-for-v4.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  MAINTAINERS: Add NuBus subsystem entry
  m68k/mac: Fix race conditions in OSS interrupt dispatch
  nubus: Add support for the driver model
  nubus: Add expansion_type values for various Mac models
  nubus: Adopt standard linked list implementation
  nubus: Rename struct nubus_dev
  nubus: Rework /proc/bus/nubus/s/ implementation
  nubus: Generalize block resource handling
  nubus: Clean up whitespace
  nubus: Remove redundant code
  nubus: Call proc_mkdir() not more than once per slot directory
  nubus: Validate slot resource IDs
  nubus: Fix log spam
  nubus: Use static functions where possible
  nubus: Fix up header split
  nubus: Avoid array underflow and overflow
  m68k/defconfig: Update defconfigs for v4.15-rc1
2018-01-29 16:37:15 -08:00
Linus Torvalds
31466f3ed7 Merge tag 'for-4.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
 "Features or user visible changes:

   - fallocate: implement zero range mode

   - avoid losing data raid profile when deleting a device

   - tree item checker: more checks for directory items and xattrs

  Notable fixes:

   - raid56 recovery: don't use cached stripes, that could be
     potentially changed and a later RMW or recovery would lead to
     corruptions or failures

   - let raid56 try harder to rebuild damaged data, reading from all
     stripes if necessary

   - fix scrub to repair raid56 in a similar way as in the case above

  Other:

   - cleanups: device freeing, removed some call indirections, redundant
     bio_put/_get, unused parameters, refactorings and renames

   - RCU list traversal fixups

   - simplify mount callchain, remove recursing back when mounting a
     subvolume

   - plug for fsync, may improve bio merging on multiple devices

   - compression heurisic: replace heap sort with radix sort, gains some
     performance

   - add extent map selftests, buffered write vs dio"

* tag 'for-4.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (155 commits)
  btrfs: drop devid as device_list_add() arg
  btrfs: get device pointer from device_list_add()
  btrfs: set the total_devices in device_list_add()
  btrfs: move pr_info into device_list_add
  btrfs: make btrfs_free_stale_devices() to match the path
  btrfs: rename btrfs_free_stale_devices() arg to skip_dev
  btrfs: make btrfs_free_stale_devices() argument optional
  btrfs: make btrfs_free_stale_device() to iterate all stales
  btrfs: no need to check for btrfs_fs_devices::seeding
  btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it
  Btrfs: noinline merge_extent_mapping
  Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping
  Btrfs: extent map selftest: dio write vs dio read
  Btrfs: extent map selftest: buffered write vs dio read
  Btrfs: add extent map selftests
  Btrfs: move extent map specific code to extent_map.c
  Btrfs: add helper for em merge logic
  Btrfs: fix unexpected EEXIST from btrfs_get_extent
  Btrfs: fix incorrect block_len in merge_extent_mapping
  btrfs: Remove unused readahead spinlock
  ...
2018-01-29 14:04:23 -08:00
Linus Torvalds
0a4b6e2f80 Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
 "This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...
2018-01-29 11:51:49 -08:00
Nicolas Dichtel
38e01b3056 dev: advertise the new ifindex when the netns iface changes
The goal is to let the user follow an interface that moves to another
netns.

CC: Jiri Benc <jbenc@redhat.com>
CC: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29 12:23:52 -05:00
David S. Miller
457740a903 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-01-26

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) A number of extensions to tcp-bpf, from Lawrence.
    - direct R or R/W access to many tcp_sock fields via bpf_sock_ops
    - passing up to 3 arguments to bpf_sock_ops functions
    - tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks
    - optionally calling bpf_sock_ops program when RTO fires
    - optionally calling bpf_sock_ops program when packet is retransmitted
    - optionally calling bpf_sock_ops program when TCP state changes
    - access to tclass and sk_txhash
    - new selftest

2) div/mod exception handling, from Daniel.
    One of the ugly leftovers from the early eBPF days is that div/mod
    operations based on registers have a hard-coded src_reg == 0 test
    in the interpreter as well as in JIT code generators that would
    return from the BPF program with exit code 0. This was basically
    adopted from cBPF interpreter for historical reasons.
    There are multiple reasons why this is very suboptimal and prone
    to bugs. To name one: the return code mapping for such abnormal
    program exit of 0 does not always match with a suitable program
    type's exit code mapping. For example, '0' in tc means action 'ok'
    where the packet gets passed further up the stack, which is just
    undesirable for such cases (e.g. when implementing policy) and
    also does not match with other program types.
    After considering _four_ different ways to address the problem,
    we adapt the same behavior as on some major archs like ARMv8:
    X div 0 results in 0, and X mod 0 results in X. aarch64 and
    aarch32 ISA do not generate any traps or otherwise aborts
    of program execution for unsigned divides.
    Given the options, it seems the most suitable from
    all of them, also since major archs have similar schemes in
    place. Given this is all in the realm of undefined behavior,
    we still have the option to adapt if deemed necessary.

3) sockmap sample refactoring, from John.

4) lpm map get_next_key fixes, from Yonghong.

5) test cleanups, from Alexei and Prashant.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-28 21:22:46 -05:00
William Tu
fc1372f89f openvswitch: add erspan version I and II support
The patch adds support for openvswitch to configure erspan
v1 and v2.  The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added
to uapi as a binary blob to support all ERSPAN v1 and v2's
fields.  Note that Previous commit "openvswitch: Add erspan tunnel
support." was reverted since it does not design properly.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:39:43 -05:00
William Tu
d350a82302 net: erspan: create erspan metadata uapi header
The patch adds a new uapi header file, erspan.h, and moves
the 'struct erspan_metadata' from internal erspan.h to it.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:39:43 -05:00
Lawrence Brakmo
d44874910a bpf: Add BPF_SOCK_OPS_STATE_CB
Adds support for calling sock_ops BPF program when there is a TCP state
change. Two arguments are used; one for the old state and another for
the new state.

There is a new enum in include/uapi/linux/bpf.h that exports the TCP
states that prepends BPF_ to the current TCP state names. If it is ever
necessary to change the internal TCP state values (other than adding
more to the end), then it will become necessary to convert from the
internal TCP state value to the BPF value before calling the BPF
sock_ops function. There are a set of compile checks added in tcp.c
to detect if the internal and BPF values differ so we can make the
necessary fixes.

New op: BPF_SOCK_OPS_STATE_CB.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:15 -08:00
Lawrence Brakmo
a31ad29e6a bpf: Add BPF_SOCK_OPS_RETRANS_CB
Adds support for calling sock_ops BPF program when there is a
retransmission. Three arguments are used; one for the sequence number,
another for the number of segments retransmitted, and the last one for
the return value of tcp_transmit_skb (0 => success).
Does not include syn-ack retransmissions.

New op: BPF_SOCK_OPS_RETRANS_CB.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
44f0e43037 bpf: Add support for reading sk_state and more
Add support for reading many more tcp_sock fields

  state,	same as sk->sk_state
  rtt_min	same as sk->rtt_min.s[0].v (current rtt_min)
  snd_ssthresh
  rcv_nxt
  snd_nxt
  snd_una
  mss_cache
  ecn_flags
  rate_delivered
  rate_interval_us
  packets_out
  retrans_out
  total_retrans
  segs_in
  data_segs_in
  segs_out
  data_segs_out
  lost_out
  sacked_out
  sk_txhash
  bytes_received (__u64)
  bytes_acked    (__u64)

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
f89013f66d bpf: Add sock_ops RTO callback
Adds an optional call to sock_ops BPF program based on whether the
BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags.
The BPF program is passed 2 arguments: icsk_retransmits and whether the
RTO has expired.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
b13d880721 bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary
use is to determine if there should be calls to sock_ops bpf program at
various points in the TCP code. The field is initialized to zero,
disabling the calls. A sock_ops BPF program can set it, per connection and
as necessary, when the connection is established.

It also adds support for reading and writting the field within a
sock_ops BPF program. Reading is done by accessing the field directly.
However, writing is done through the helper function
bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program
is trying to set a callback that is not supported in the current kernel
(i.e. running an older kernel). The helper function returns 0 if it was
able to set all of the bits set in the argument, a positive number
containing the bits that could not be set, or -EINVAL if the socket is
not a full TCP socket.

Examples of where one could call the bpf program:

1) When RTO fires
2) When a packet is retransmitted
3) When the connection terminates
4) When a packet is sent
5) When a packet is received

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
de525be2ca bpf: Support passing args to sock_ops bpf function
Adds support for passing up to 4 arguments to sock_ops bpf functions. It
reusues the reply union, so the bpf_sock_ops structures are not
increased in size.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Jürg Billeter
e1fc742e14 fs: add RWF_APPEND
This is the per-I/O equivalent of O_APPEND to support atomic append
operations on any open file.

If a file is opened with O_APPEND, pwrite() ignores the offset and
always appends data to the end of the file. RWF_APPEND enables atomic
append and pwrite() with offset on a single file descriptor.

Signed-off-by: Jürg Billeter <j@bitron.ch>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-01-25 19:32:55 -05:00
David S. Miller
5ca114400d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in
'net'.

The esp{4,6}_offload.c conflicts were overlapping changes.
The 'out' label is removed so we just return ERR_PTR(-EINVAL)
directly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 13:51:56 -05:00
Bob Peterson
805c090750 GFS2: Log the reason for log flushes in every log header
This patch just adds the capability for GFS2 to track which function
called gfs2_log_flush. This should make it easier to diagnose
problems based on the sequence of events found in the journals.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-01-23 07:39:20 -07:00
Bob Peterson
c1696fb85d GFS2: Introduce new gfs2_log_header_v2
This patch adds a new structure called gfs2_log_header_v2 which is used
to store expanded fields into previously unused areas of the log headers
(i.e., this change is backwards compatible).  Some of these are used for
debug purposes so we can backtrack when problems occur.  Others are
reserved for future expansion.

This patch is based on a prototype from Steve Whitehouse.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-01-23 07:38:53 -07:00
David Decotigny
b2d3bcfa26 net: core: Expose number of link up/down transitions
Expose the number of times the link has been going UP or DOWN, and
update the "carrier_changes" counter to be the sum of these two events.
While at it, also update the sysfs-class-net documentation to cover:
carrier_changes (3.15), carrier_up_count (4.16) and carrier_down_count
(4.16)

Signed-off-by: David Decotigny <decot@googlers.com>
[Florian:
* rebase
* add documentation
* merge carrier_changes with up/down counters]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 15:42:05 -05:00
Sabrina Dubroca
e8660ded7f macsec: restore uAPI after addition of GCM-AES-256
Commit ccfdec9089 ("macsec: Add support for GCM-AES-256 cipher suite")
changed a few values in the uapi headers for MACsec.

Because of existing userspace implementations, we need to preserve the
value of MACSEC_DEFAULT_CIPHER_ID. Not doing that resulted in
wpa_supplicant segfaults when a secure channel was created using the
default cipher. Thus, swap MACSEC_DEFAULT_CIPHER_{ID,ALT} back to their
original values.

Changing the maximum length of the MACSEC_SA_ATTR_KEY attribute is
unnecessary, as the previous value (MACSEC_MAX_KEY_LEN, which was 128B)
is large enough to carry 32-bytes keys. This patch reverts
MACSEC_MAX_KEY_LEN to 128B and restores the old length check on
MACSEC_SA_ATTR_KEY.

Fixes: ccfdec9089 ("macsec: Add support for GCM-AES-256 cipher suite")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 15:40:16 -05:00
Anand Jain
98820a7e24 btrfs: add support for SUPER_FLAG_CHANGING_FSID
The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it
only when changing fsid is complete. Its not a good idea to mount the
device anything in between, reading metadata blocks would fail with UUID
mismatch.

This patch doesn't add SUPER_FLAG_CHANGING_FSID into
BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in
the next patch) when SUPER_FLAG_CHANGING_FSID is set.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22 16:08:21 +01:00
Anand Jain
e2731e5588 btrfs: define SUPER_FLAG_METADUMP_V2
btrfs-progs uses super flag bit BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34).
So just define that in kernel so that we know its been used.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22 16:08:21 +01:00
Anand Jain
ad8bc4d005 btrfs: put btrfs_ioctl_vol_args_v2 related defines together
Just a code spatial rearrangement, no functional change.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22 16:08:16 +01:00
David S. Miller
cbcbeedbfd Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree. Basically, a new extension for ip6tables, simplification work of
nf_tables that saves us 500 LoC, allow raw table registration before
defragmentation, conversion of the SNMP helper to use the ASN.1 code
generator, unique 64-bit handle for all nf_tables objects and fixes to
address fallout from previous nf-next batch.  More specifically, they
are:

1) Seven patches to remove family abstraction layer (struct nft_af_info)
   in nf_tables, this simplifies our codebase and it saves us 64 bytes per
   net namespace.

2) Add IPv6 segment routing header matching for ip6tables, from Ahmed
   Abdelsalam.

3) Allow to register iptable_raw table before defragmentation, some
   people do not want to waste cycles on defragmenting traffic that is
   going to be dropped, hence add a new module parameter to enable this
   behaviour in iptables and ip6tables. From Subash Abhinov
   Kasiviswanathan. This patch needed a couple of follow up patches to
   get things tidy from Arnd Bergmann.

4) SNMP helper uses the ASN.1 code generator, from Taehee Yoo. Several
   patches for this helper to prepare this change are also part of this
   patch series.

5) Add 64-bit handles to uniquely objects in nf_tables, from Harsha
   Sharma.

6) Remove log message that several netfilter subsystems print at
   boot/load time.

7) Restore x_tables module autoloading, that got broken in a previous
   patch to allow singleton NAT hook callback registration per hook
   spot, from Florian Westphal. Moreover, return EBUSY to report that
   the singleton NAT hook slot is already in instead.

8) Several fixes for the new nf_tables flowtable representation,
   including incorrect error check after nf_tables_flowtable_lookup(),
   missing Kconfig dependencies that lead to build breakage and missing
   initialization of priority and hooknum in flowtable object.

9) Missing NETFILTER_FAMILY_ARP dependency in Kconfig for the clusterip
   target. This is due to recent updates in the core to shrink the hook
   array size and compile it out if no specific family is enabled via
   .config file. Patch from Florian Westphal.

10) Remove duplicated include header files, from Wei Yongjun.

11) Sparse warning fix for the NFPROTO_INET handling from the core
    due to missing static function definition, also from Wei Yongjun.

12) Restore ICMPv6 Parameter Problem error reporting when
    defragmentation fails, from Subash Abhinov Kasiviswanathan.

13) Remove obsolete owner field initialization from struct
    file_operations, patch from Alexey Dobriyan.

14) Use boolean datatype where needed in the Netfilter codebase, from
    Gustavo A. R. Silva.

15) Remove double semicolon in dynset nf_tables expression, from
    Luis de Bethencourt.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-21 11:35:34 -05:00
David S. Miller
ea9722e265 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-01-19

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) bpf array map HW offload, from Jakub.

2) support for bpf_get_next_key() for LPM map, from Yonghong.

3) test_verifier now runs loaded programs, from Alexei.

4) xdp cpumap monitoring, from Jesper.

5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

6) user space can now retrieve HW JITed program, from Jiong.

Note there is a minor conflict between Russell's arm32 JIT fixes
and removal of bpf_jit_enable variable by Daniel which should
be resolved by keeping Russell's comment and removing that variable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-20 22:03:46 -05:00
Linus Torvalds
24b6124047 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
 "ARM:
   - fix incorrect huge page mappings on systems using the contiguous
     hint for hugetlbfs
   - support alternative GICv4 init sequence
   - correctly implement the ARM SMCC for HVC and SMC handling

  PPC:
   - add KVM IOCTL for reporting vulnerability and workaround status

  s390:
   - provide userspace interface for branch prediction changes in
     firmware

  x86:
   - use correct macros for bits"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: wire up bpb feature
  KVM: PPC: Book3S: Provide information about hardware/firmware CVE workarounds
  KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs()
  arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  KVM: arm64: Fix GICv4 init when called from vgic_its_create
  KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
2018-01-20 11:41:09 -08:00
Christian Borntraeger
35b3fde620 KVM: s390: wire up bpb feature
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-20 17:30:47 +01:00
Lorenzo Bianconi
4db5a802e5 l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:00:49 -05:00
Harsha Sharma
3ecbfd65f5 netfilter: nf_tables: allocate handle and delete objects via handle
This patch allows deletion of objects via unique handle which can be
listed via '-a' option.

Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-19 14:00:46 +01:00
Paul Mackerras
3214d01f13 KVM: PPC: Book3S: Provide information about hardware/firmware CVE workarounds
This adds a new ioctl, KVM_PPC_GET_CPU_CHAR, that gives userspace
information about the underlying machine's level of vulnerability
to the recently announced vulnerabilities CVE-2017-5715,
CVE-2017-5753 and CVE-2017-5754, and whether the machine provides
instructions to assist software to work around the vulnerabilities.

The ioctl returns two u64 words describing characteristics of the
CPU and required software behaviour respectively, plus two mask
words which indicate which bits have been filled in by the kernel,
for extensibility.  The bit definitions are the same as for the
new H_GET_CPU_CHARACTERISTICS hypercall.

There is also a new capability, KVM_CAP_PPC_GET_CPU_CHAR, which
indicates whether the new ioctl is available.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-19 15:17:01 +11:00
Jakub Kicinski
52775b33bb bpf: offload: report device information about offloaded maps
Tell user space about device on which the map was created.
Unfortunate reality of user ABI makes sharing this code
with program offload difficult but the information is the
same.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 22:54:25 +01:00
Bob Peterson
786ebd9f68 Merge branch 'punch-hole' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git 2018-01-18 14:17:13 -07:00
Jesper Dangaard Brouer
cb5f7334d4 bpf: add comments to BPF ld/ldx sizes
Doc BPF ld/ldx size defines as comments in code, as it makes in
faster to lookup in a programming/review setting, than looking up
the sizes in Documentation/networking/filter.txt.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18 22:12:38 +01:00
David S. Miller
4f7d58517f Merge tag 'linux-can-next-for-4.16-20180116' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:

====================
pull-request: can-next 2018-01-16

this is a pull request for net-next/master consisting of 9 patches.

This is a series of patches, some of them initially by Franklin S Cooper
Jr, which was picked up by Faiz Abbas. Faiz Abbas added some patches
while working on this series, I contributed one as well.

The first two patches add support to CAN device infrastructure to limit
the bitrate of a CAN adapter if the used CAN-transceiver has a certain
maximum bitrate.

The remaining patches improve the m_can driver. They add support for
bitrate limiting to the driver, clean up the driver and add support for
runtime PM.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:08:25 -05:00
Jason Wang
aff3d70a07 tun: allow to attach ebpf socket filter
This patch allows userspace to attach eBPF filter to tun. This will
allow to implement VM dataplane filtering in a more efficient way
compared to cBPF filter by allowing either qemu or libvirt to
attach eBPF filter to tun.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:32:10 -05:00
Jiri Pirko
d47a6b0e7c net: sched: introduce ingress/egress block index attributes for qdisc
Introduce two new attributes to be used for qdisc creation and dumping.
One for ingress block, one for egress block. Introduce a set of ops that
qdisc which supports block sharing would implement.

Passing block indexes in qdisc change is not supported yet and it is
checked and forbidded.

In future, these attributes are to be reused for specifying block
indexes for classes as well. As of this moment however, it is not
supported so a check is in place to forbid it.

Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko
7960d1daf2 net: sched: use block index as a handle instead of qdisc when block is shared
As the tcm_ifindex with value TCM_IFINDEX_MAGIC_BLOCK is invalid ifindex,
use it to indicate that we work with block, instead of qdisc.
So if tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, tcm_parent is used
to carry block_index.

If the block is set to be shared between at least 2 qdiscs, it is
forbidden to use the qdisc handle to add/delete filters. In that case,
userspace has to pass block_index.

Also, for dump of the filters, in case the block is shared in between at
least 2 qdiscs, the each filter is dumped with tcm_ifindex value
TCM_IFINDEX_MAGIC_BLOCK and tcm_parent set to block_index. That gives
the user clear indication, that the filter belongs to a shared block
and not only to one qdisc under which it is dumped.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Ingo Molnar
7a7368a5f2 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-17 17:20:08 +01:00
David S. Miller
c02b3741eb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Overlapping changes all over.

The mini-qdisc bits were a little bit tricky, however.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 00:10:42 -05:00
David S. Miller
7018d1b3f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-01-17

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add initial BPF map offloading for nfp driver. Currently only
   programs were supported so far w/o being able to access maps.
   Offloaded programs are right now only allowed to perform map
   lookups, and control path is responsible for populating the
   maps. BPF core infrastructure along with nfp implementation is
   provided, from Jakub.

2) Various follow-ups to Josef's BPF error injections. More
   specifically that includes: properly check whether the error
   injectable event is on function entry or not, remove the percpu
   bpf_kprobe_override and rather compare instruction pointer
   with original one, separate error-injection from kprobes since
   it's not limited to it, add injectable error types in order to
   specify what is the expected type of failure, and last but not
   least also support the kernel's fault injection framework, all
   from Masami.

3) Various misc improvements and cleanups to the libbpf Makefile.
   That is, fix permissions when installing BPF header files, remove
   unused variables and functions, and also install the libbpf.h
   header, from Jesper.

4) When offloading to nfp JIT and the BPF insn is unsupported in the
   JIT, then reject right at verification time. Also fix libbpf with
   regards to ELF section name matching by properly treating the
   program type as prefix. Both from Quentin.

5) Add -DPACKAGE to bpftool when including bfd.h for the disassembler.
   This is needed, for example, when building libfd from source as
   bpftool doesn't supply a config.h for bfd.h. Fix from Jiong.

6) xdp_convert_ctx_access() is simplified since it doesn't need to
   set target size during verification, from Jesper.

7) Let bpftool properly recognize BPF_PROG_TYPE_CGROUP_DEVICE
   program types, from Roman.

8) Various functions in BPF cpumap were not declared static, from Wei.

9) Fix a double semicolon in BPF samples, from Luis.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 22:42:14 -05:00
Arkadi Sharshevsky
56dc7cd0a8 devlink: Add relation between dpipe and resource
The hardware processes which are modeled via dpipe commonly use some
internal hardware resources. Such relation can improve the understanding
of hardware limitations. The number of resource's unit consumed per
table's entry are also provided for each table.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:15:34 -05:00
Arkadi Sharshevsky
2d8dc5bbf4 devlink: Add support for reload
Add support for performing driver hot reload.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:15:34 -05:00